[development] Do not let postgresql hold back great patches

Mon Dec 3 14:22:09 UTC 2007

On Mon, 03 Dec 2007 08:07:53 -0500
Earnie Boyd <earnie at users.sourceforge.net> wrote:

> Quoting Khalid Baheyeldin <kb at 2bits.com>:

> > 3. Do database specific stuff (including optimization) in the
> > abstraction layer (this is what Bill Moran advocates above).

> > I am for #3 as the way forward.

> Yes, it is the way to move forward.  Doing these things anywhere
> else just causes the frustration demonstrated by this thread.  We
> end up with one way to code the data selections, table creations,
> etc within everything but the Drupal SQL [1] abstraction engine.
> Most of it is already taken care of, we just need to formalize,
> tweak (read patch) and document.

There are plenty DB abstraction layers and ORM. I'd hate to redo the
work in a "not so smart way".

At the lowest level you can find things like PDO, adodb etc... with
no surprise at the higher level you can find things like SQLAlchemy,
Geniusql, storm all written in python.
PHP just recently start to resemble an OO language.

Stuff like ORM are too general beasts, to be fun to use they need a
stronger OO language and carry an overhead most people using Drupal
won't be happy to live with.

RoR has ActiveRecord but RoR stays at a lower level in the pyramid of
framework->CMS.

To increase abstraction at the lowest level we can just pick up PDO.
At a higher abstraction layer, as usual, it is important to guess
which is the most suitable level of abstraction and generalisation to
avoid to be stuck in the future while not degrading performances.

I'd provide more than one very thin layer of abstraction.
A higher level could be an ORM specialised on the most important
objects in Drupal (nodes, users, taxonomy...).
At a lower level an SQL rewriting engine that could be extended later
that will support the most popular SQL functions and operations
across different DB.
At the lowest level a way to directly write SQL with a default not
optimised route using the second level of abstraction and a specific
DB route for each DB if the developers are willing to optimise, but
in a more elegant way than writing switch.

As a side note without serialising, un-serialising, re-serialising,
the second level of abstraction could provide more security for SQL
injection since it will actually be a sort of SQL compiler rather
than a regexp filter.

I'd rely a lot on conventions to keep the code short and less
error prone.

Eg. if you're using group by, the returned fields should be in the
select statement and in the group by statement in pg.
If you pass fields names with a certain convention, fields will be
put in the group by statement by the compiler without the need to
list them twice and so on...

Similarly for the distinct statement.

Somehow we could even pre-compile SQL statement or save some cached
info (as count, I'm testing count caching across pages to speed up
paging on MS SQL).

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it