Nov 5, 2012

AOP, yeah you know me. And some postgresql/windows context issues.

AOP

Unified BPM vill make heavy use of Aspect Oriented Programming.
Why? Because if there ever was a system that would benefit from centralizing cross-cutting concerns, it is this one.
There is basically none of the typical uses for AOP that aren't present everywhere in Unified BPM: Security, Logging, Caching(memoization), Syncronization, Connection management, Proxification, Run-time debugging, tracing, statistics and basically whatever you could think of, they are all present here.

Also, the support for it in Python is positively excellent and simple to use. Especially using the "decorator" module.


Postgresql-client vs. Windows USERNAME

So, suddenly Hudson started reporting that the Win XP-postgresql tests failed.
I also realized that they only failed while being remotely run by the Hudson test runner.
I just could not get my head around it, after a while I managed to find out it was getpass that tried to load the pwd-module. Which of course doesn't exist in the Windows environment.
I had no idea why, though. It really had no use for that data. The only reason was that it used it to populate some defaults when it couldn't find the USERNAME environmental variable.
The thing was, that that variable was available when I ran it as the same user from a logged in shell. But it seemed it wasn't when being run by Hudson.

It took me a while to come to the conclusion that my only option was to manually set the USERNAME variable as a parameter in Hudson. I did, and it worked.

It seems that the USERNAME environment variable isn't set by Windows, but by the process creator. Or something to that effect.


Sep 2, 2012

Database editor and structure

Hi,

Some new stuff, have not had much time the last year. Most importantly though, the genesis of the UBPM database structure builds on all platforms(postgres, mySQL, SQL server, DB2 and Oracle).

Editor for XML SQL structures:

So first, I have created a web based editor for the XML SQL structures. Behold.


It resides in the Q_N_D-folder, which means that it is quick and dirty, something built to be able to move forward. However, it had to have some..

Features:
  • Ability to store(and load) the resulting data files centrally(preferrably within the source tree).
  • Functionality to build upgrade files for the upgrades.
  • It had to be easy to also make stand-alone statements.
Just run controller.py and open up http://127.0.0.1:8080 to watch it in action.
It uses cherryPy for the backend and Genshi for templating, a really nice and lightweight framework indeed.


The database


Using that, I have finally stumbled into what for me is far more familiar ground, database design.
I have created an upgrade script that creates the first parts of the Unified BPM database structure(stuff like a tree, rights, users, languages).


The database will store the persistent structural data in a central tree.
Its meta data will be stored in one of three different ways, depending on what suits the specific entity considering simplicity, flexibility and performance.

The data models:


  • Relational:
    This is the usual way people store data, with sub tables detailing the nodes in the tree.
  • EAV(Entity-Attribute-Value):
    Instead of having many subtables, everything is stored in one table. If one doesn't want to use variants, one uses one table per data type. In reality, however, that doesn't matter as much, as database engines nowadays store data in a far more intelligent manner, so it could actually be SLOWER to NOT use variants. How about that, strongly typing advocates?
  • XML:
    Actually similar to to EAV in many modern database implementations, where XML is its own data type which can be queried just like a table structure. However, UBPM will not use any these implementations however cool they are. UBPM will only store the XML as strings.
    The reason is that 1) all those implementations are proprietary and would be hell to generalize and 2) that it is not the point. XML data will be used to allow for freely structured storage...that is somewhat structured. If you know what I mean. An example would be a 3rd party kind of data. 

In conclusion


It now seems that most of the surprisingly large technical hurdles in dealing with multiple database platforms has been negotiated. Now it is just a matter of expanding the support for more of the SQL standard. But the next in line, like ALTER TABLE and UPDATE are quite easy and contains the same elements as the already implemented.
Also, my self-educational goals up till now has been met as well, I have learned quite a lot.
From here on, if feels like it should be a far more straight forward matter of comparatively simply building the system. I suspect the multi-platform deployment will create problems, however I will simply not try to generalize between windows and linux installations. It will basically be completely different solutions.

Nov 23, 2011

Success!

Hi everybody!

Now I have finally succeeded in making the following work:

  • XML to object structure

  • Object structure to XML (the tests goes both ways)


  • The following structures are currently supported:

    • SELECT

      • Operators/Identifiers/Aliases
      • Function calls
      • TOP/LIMIT
      • CASE..WHEN
      • JOIN
      • WHERE
      • ORDER BY



    • INSERT INTO (from SELECT)

    • CREATE TABLE

      • Constraints(PK/FK)
      • Indexes
      • Default values
      • Auto increment. On Oracle using triggers, thanks for that pointless inconvenience, Oracle :-(



    • Custom

      • The "custom" structure simply take an SQL statement for each backend.

        It is intended for those cases when very database-specific features are needed.


  • The db_upgrader library, that uses the DAL to upgrade databases, outside of performing the upgrades from an XML-definition, now tracks applications and versions. Cewl stuff.


Basically, those three statements are enough to build most things.
There is usually no long-term gain into making a database much more complex than so.
I'd say you solve 95 % of your database creation needs with that, and what it does not solve..well perhaps it should not be in the database layer.



These structures are all tested as integration tests that are run on both Windows XP and Debian(unstable) that, using the db_upgrader library creates an example database on the following platforms:

  • Postgresql(on Debian stable)

  • MySQL(on Debian stable)

  • MSSQL server Express(on Windows XP)

  • IBM DB2 Express-C (on Debian stable)

  • Oracle 11g XE (on Fedora 16)




I can say one thing, it sure wasn't easy.

I basically had these requirements:
  1. The Database Abstraction Layer(DAL) should be able to run on both Windows and Linux and seamlessly be able to do the same thing on PostgreSQL, MySQL, MSSQL, DB2 and Oracle.
  2. The installation of the clients must be simple(server:port, database, user, password) , and therefore any installation procedures need to be scriptable. A user should not need to edit a million config files, set environment variables or download lots of files to make it work.
  3. The connection to the database backends should not need lots of tweaking, it should "just work". I did not mind if some performance had to be sacrificed. According to my experience after 10 years of database development, it is usually not the driver that is the show stopping performance bottleneck, but indexing, lack of proper database engine internals and, of course bad database design.
I almost met all of them. The only one were I didn't make it all the way was point 2.
And actually, I think I can semi-automate or at least wildly simplify those processes if the worst client-side hassle offenders, IBM DB2 and Oracle 11g, is OK with it.

Configuring the back end servers was a real pain though, especially Oracle 11g.
Don't try install that on Debian/Ubuntu, folks. Go with Fedora for Oracle 11g and you'll live longer.
Luckily, I am done with that now. :-)

On to moving the whole thing stuff into public display on the Sourceforge GIT. Or perhaps somewhere else, Google code? Ideas?

Anyway, I will now set up some CI-testing solution around my servers and clients. That'd be great.

And then it is on to some more installation(.deb/.msi files) stuff and then the fun part. Adding features.

Sep 18, 2011

XML to structure and back again.

Hi again.

After a year long pause, I have started with this again.
And so I now have my first test of the XML-to-SQL objects-to-XML.
And it works to both parse and generate it back from the object structure and getting a file identical to the first.

The XML looks like this:

<?xml version="1.0" ?>
<sql:Statement xmlns:sql="http://www.unifiedbpm.se/XMLschema/DAL/SQL"
xmlns:xsi="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="http://www.unifiedbpm.se/XMLschema/DAL/SQL ../SQL.xsd">
<sql:Verb_CREATE_TABLE>
<sql:Name>Table1</sql:Name>
<sql:Constraints>
<sql:Parameter_Constraint>
<sql:CheckConditions />
<sql:Constraint_Type>PRIMARY KEY</sql:Constraint_Type>
<sql:References>
<sql:str>Table1ID</sql:str>
</sql:References>
<sql:Name>PK_Table1_Table1ID</sql:Name>
</sql:Parameter_Constraint>
<sql:Parameter_Constraint>
<sql:CheckConditions />
<sql:Constraint_Type>FOREIGN KEY</sql:Constraint_Type>
<sql:References>
<sql:str>Table1ID</sql:str>
<sql:str>Table2</sql:str>
<sql:str>Table2ID</sql:str>
</sql:References>
<sql:Name>FK_Table1_Table1ID_Table2_Table2ID</sql:Name>
</sql:Parameter_Constraint>
<sql:Parameter_Constraint>
<sql:CheckConditions>
<sql:Parameter_Conditions>
<sql:Conditions>
<sql:Parameter_Condition>
<sql:Operator>></sql:Operator>
<sql:ANDOR>AND</sql:ANDOR>
<sql:Right>
<sql:Parameter_Numeric>
<sql:Numeric_Value>2.4</sql:Numeric_Value>
</sql:Parameter_Numeric>
</sql:Right>
<sql:Left>
<sql:Parameter_Numeric>
<sql:Numeric_Value>1.3</sql:Numeric_Value>
</sql:Parameter_Numeric>
</sql:Left>
</sql:Parameter_Condition>
<sql:Parameter_Condition>
<sql:Operator>LIKE</sql:Operator>
<sql:ANDOR>AND</sql:ANDOR>
<sql:Right>
<sql:Parameter_String>
<sql:String_Value>%</sql:String_Value>
</sql:Parameter_String>
</sql:Right>
<sql:Left>
<sql:Parameter_Identifier>
<sql:Identifier>T1.firstname</sql:Identifier>
</sql:Parameter_Identifier>
</sql:Left>
</sql:Parameter_Condition>
</sql:Conditions>
</sql:Parameter_Conditions>
</sql:CheckConditions>
<sql:Constraint_Type>CHECK</sql:Constraint_Type>
<sql:References>
<sql:str>Table1ID</sql:str>
</sql:References>
<sql:Name>CK_Table1_Name</sql:Name>
</sql:Parameter_Constraint>
<sql:Parameter_Constraint>
<sql:CheckConditions />
<sql:Constraint_Type>UNIQUE</sql:Constraint_Type>
<sql:References>
<sql:str>Table1ID</sql:str>
</sql:References>
<sql:Name>UQ_Table1_Name</sql:Name>
</sql:Parameter_Constraint>
</sql:Constraints>
<sql:Columns>
<sql:Parameter_ColumnDefinition>
<sql:Datatype>serial</sql:Datatype>
<sql:Default />
<sql:NOTNULL>True</sql:NOTNULL>
<sql:Name>Table1ID</sql:Name>
</sql:Parameter_ColumnDefinition>
<sql:Parameter_ColumnDefinition>
<sql:Datatype>string(400)</sql:Datatype>
<sql:Default />
<sql:NOTNULL>False</sql:NOTNULL>
<sql:Name>Table1Name</sql:Name>
</sql:Parameter_ColumnDefinition>
<sql:Parameter_ColumnDefinition>
<sql:Datatype>timestamp</sql:Datatype>
<sql:Default />
<sql:NOTNULL>False</sql:NOTNULL>
<sql:Name>Table1Changed</sql:Name>
</sql:Parameter_ColumnDefinition>
</sql:Columns>
<sql:postSQL />
</sql:Verb_CREATE_TABLE>
</sql:Statement>



As you see, it is basically an abstraction of the SQL language.
It should be pretty easy to write a GUI for generating the XML that would be similar to actually writing SQL.
With that as input, the following SQLs are generated when calling any of the SQL objects' AsSQL-method(note the oracle syntax):


SQL for the MySQL database:

CREATE TABLE Table1 (
`Table1ID` INTEGER AUTO_INCREMENT NOT NULL,
`Table1Name` VARCHAR(400),
`Table1Changed` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT `PK_Table1_Table1ID` PRIMARY KEY (Table1ID),
CONSTRAINT `FK_Table1_Table1ID_Table2_Table2ID` FOREIGN KEY (Table1ID) REFERENCES Table2(Table2ID),
CONSTRAINT `CK_Table1_Name` CHECK ((1.3 > 2.4) AND (T1.firstname LIKE '%ercedes')),
CONSTRAINT `UQ_Table1_Name` UNIQUE (Table1ID)
)


SQL for the PostgreSQL database:



CREATE TABLE Table1 (
"Table1ID" serial NOT NULL,
"Table1Name" VARCHAR(400),
"Table1Changed" DATETIME DEFAULT(NOW()),
CONSTRAINT "PK_Table1_Table1ID" PRIMARY KEY (Table1ID),
CONSTRAINT "FK_Table1_Table1ID_Table2_Table2ID" FOREIGN KEY (Table1ID) REFERENCES Table2(Table2ID),
CONSTRAINT "CK_Table1_Name" CHECK ((1.3 > 2.4) AND (T1.firstname ILIKE '%
ercedes')),
CONSTRAINT "UQ_Table1_Name" UNIQUE (Table1ID)
)


SQL for the Oracle database:



CREATE TABLE Table1 (
"Table1ID" integer NOT NULL,
"Table1Name" VARCHAR2(400),
"Table1Changed" DATETIME DEFAULT(NOW()),
CONSTRAINT "PK_Table1_Table1ID" PRIMARY KEY (Table1ID),
CONSTRAINT "FK_Table1_Table1ID_Table2_Table2ID" FOREIGN KEY (Table1ID) REFERENCES Table2(Table2ID),
CONSTRAINT "CK_Table1_Name" CHECK ((1.3 > 2.4) AND (T1.firstname LIKE '%
ercedes' escape 'C')),
CONSTRAINT "UQ_Table1_Name" UNIQUE (Table1ID)
)
CREATE SEQUENCE seq_Table1_Table1ID_DAL_auto_increment
start with 1
increment by 1
nomaxvalue;
CREATE TRIGGER tr_Table1_Table1ID_DAL_auto_increment
BEFORE INSERT ON Table1 FOR EACH ROW BEGIN
SELECT seq_Table1_Table1ID_DAL_auto_increment.nextval INTO :new.id FROM dual;
end;


SQL for the DB2 database:



CREATE TABLE Table1 (
"Table1ID" INT GENERATED ALWAYS AS IDENTITY NOT NULL,
"Table1Name" VARCHAR(400),
"Table1Changed" TIMESTAMP,
CONSTRAINT "PK_Table1_Table1ID" PRIMARY KEY (Table1ID),
CONSTRAINT "FK_Table1_Table1ID_Table2_Table2ID" FOREIGN KEY (Table1ID) REFERENCES Table2(Table2ID),
CONSTRAINT "CK_Table1_Name" CHECK ((1.3 > 2.4) AND (T1.firstname LIKE '%
ercedes')),
CONSTRAINT "UQ_Table1_Name" UNIQUE (Table1ID)
)


SQL for the SQLserver database:



CREATE TABLE Table1 (
[Table1ID] int IDENTITY(1,1) NOT NULL,
[Table1Name] varchar(400),
[Table1Changed] DATETIME DEFAULT(NOW()),
CONSTRAINT [PK_Table1_Table1ID] PRIMARY KEY (Table1ID),
CONSTRAINT [FK_Table1_Table1ID_Table2_Table2ID] FOREIGN KEY (Table1ID) REFERENCES Table2(Table2ID),
CONSTRAINT [CK_Table1_Name] CHECK ((1.3 > 2.4) AND (T1.firstname LIKE '%
ercedes')),
CONSTRAINT [UQ_Table1_Name] UNIQUE (Table1ID)
)


Small disclaimer: I haven't run these scripts on all database architectures, am currently rebuilding the testing environment.

Oct 26, 2010

XML, back and forth and here again.

Hi,
Lately i've busy creating, validating and parsing XML.

There are now two schemas used when defining upgrade files:
http://www.unifiedbpm.se/XMLschema/DAL/SQL.xsd
http://www.unifiedbpm.se/XMLschema/DBUpgrader/DBUpgrader.xsd

The first defines the, so far supported, SQL, and the second defines the DBUpgraders upgrade file.

As always, I had less time and it took more effort than I thought, but now I am closing in on having a working upgrader.

It is pretty cool to watch integration tests run that creates the same structure on five different database platforms, inserts data and selects back and actually works.
A bit left to do, but after that, this might actually be a useful functionality.

That's all folks, bye!

Sep 29, 2010

Road map.

Hi,

SO FAR

I've gotten to a point where I have proven and implemented the two most important technical concepts needed for the development of Unified BPM :
1. Database abstraction layer
2. Database upgrade management including an (almost finished) XML schema for upgrade files. The XML schema is to allow for easy upgrade client development, one should not have to dabble with XML file editing. Interesting point is that the code generates the schema mostly by itself, based on its functionality, allowing for simpler maintenance.

This functionality extends to the MySQL, PostgreSQL, DB2, Oracle and SQL server platforms, except for upgrade management, where Oracle is partially unsupported due to its aforementioned lack of auto increment shortcuts. Maybe someone knows a way around that problem?

ROAD MAP


At this stage, then, I am in the really exciting position of being able to present a short road map. It is a seriously pragmatic one, one that is geared at getting something basic working as soon as possible:

1. Basic database abstraction/upgrade management. Creating tables, selecting and inserting data. I won't do anything more advanced with this unless other parts of UBPM needs more functionality. In my experience and according to my expectation, needing advanced SQL statements in the beginning of a project is mostly due to poor database design. Also, I am consciously keeping the database as simple as possible at this stage. I am nearly done with this stage. I have some structural stuff left on the database upgrader, but then I will move on.

2. The next step will be the messaging system of UBPM. I will begin with deciding the basic structure of communication within UBPM. I have already figured most of it out, but now I need to scale that down to its bare necessities in order to start somewhere.
Note that no logging of communications will be implemented in this stage.
The test for this will be to send a file from a listener on one host to another. One listener will run on a Linux host, and the other on a Windows host.

3. Not only files are to be sent, so next up will be to SELECT a dataset on one database server of one type, and INSERT the same data on an other of a different type and compare. This will put the DAL through its paces and probably expose any problems there.

4. By now, the DAL has probably been tested enough to enable me to start working on the Unified BPM database structure. Its primary purpose will be to log whatever happens on the system and that will be what I'll be implementing first. I will test this, and the database upgrade process by installing and upgrading the database and running the tests of points 2 and 3.
At this stage, I will be very happy, since I would now know that all the important concepts of UBPM works. As I see it, the largest technical hurdles are now negotiated and I will have a working integration test environment.


5. GUI. I start to develop a simple web based GUI that enables the generation of the XML database upgrade files. Because, from now on, there will be significantly more database development.
In general, it will be more straightforward development, and perhaps the perfect moment to invite others to join the project. It will certainly still be small enough to be easily influenced by new ideas and feedback.

Sep 14, 2010

Connection(s) established. Initiating execution.

Hi y'all.

So now the next phase is finished.
I now implemented and tested connections to all supported database platforms.
I only used multi-platform python-components.
These are:
  • PostgreSQL(psycopg2, native)
  • MySQL(MySQLdb, native)
  • SQL Server(pyodbc using FreeTDS)
  • DB2(pyodbc, needs libdb2.so)
  • Oracle(cx_Oracle, native, needs oracle instant client)
Getting all the databases running on my network was a real chore, to be honest.
Especially DB2 and Oracle were quite cumbersome to connect to, and needed lots of proprietary stuff installed on the client. Also, they don't really support debian/ubuntu.
However, now that I know how everything works I will try to automate client installations, or at least document them well, so Unified BPM users won't have to go through all the hassle I had to.

Anyway, next now is to define a small XML schema for generalizing of database upgrade management.
My thought is that the same XML should be able to manage database upgrades for multiple platforms, through the functionality of the database abstraction layer.
I don't want some Unified BPM components to end up having half-a$$ed support for specific databases, like it is for the Drupal project, for example.
Also, this makes it possible to make really kewl database tests.