Re: [development] DB support as contribute: is it a good idea?

14 Feb 2008

      On Thu, 14 Feb 2008 15:27:25 +0800
Edison Wong <hswong3i@edin.no-ip.com> wrote:
...
Moreover, I would like recommend Moodle XMLDB implementation as a
solid example
(http://docs.moodle.org/en/Development:XMLDB_introduction).
The idea of my proposal is quite similar as that,
1. database.*.inc: similar as ADOdb
2. common.*.inc: similar as Moodle DML Library 
(http://docs.moodle.org/en/DML_functions)
3. schema.*.inc: similar as Moodle DDL Library 
(http://docs.moodle.org/en/DDL_functions)
That should be no conflict with chx and crell's work: Data API
should be something belongs to Drupal core API (similar idea as
Moodle SQL neutral statements) which is much higher level than that
of my research (similar as ADOdb). They should be split but not mix
together :-)
I've been thinking about which level of abstraction is suited for
drupal a bit.
I came out with something like what Larry wrote (not as elegant
because queries are structured array and to "compose" you add
elements to array, you don't queue methods, but I was in a
hurry...) just for MS SQL (the scope was letting a door open to
transfer the stuff currently in MS SQL to pg that is already running
drupal).
It's something that can be carried quite far but I'm
wondering if it reach the sweet spot in abstraction/performance.

Django uses a 3rd party full ORM, but Django has never been a CMS, it
always has been a framework.
Still Drupal has been a CMS but it is going in the framework
direction.

Pushing in anything like SQLAlchemy will solve the problem of DB
abstraction at the apparent cost of bloat (apparent since it has to
be proved that adopting an ORM has such a big impact on performance
once you go the multi-db support path...).
Anyway hiding the DB too much and putting both feet in the OO realm
may have its drawback too.

Anyway... Drupal as a framework doesn't have the same wide area of
application as Django and core has very well defined objects (I think
there are discussions about redefining core objects too, cck, node,
field...). It has to be seen if pushing custom DB support into core
objects (providing some methods tailored on DB capabilities to
access/modify core drupal object) is worth.

I faced situations where doing something really exploiting the
capabilities of a DB do make the difference... but still if all the
things core have to offer to interact with core objects can be
achieved efficiently with Larry's approach, we could delay the
problem of an even more abstract DB layer some more version.

I think that at this moment having a system that let you interact
with core object efficiently and let you write DB agnostic code 90%
of the time (or avoid you to write SQL code at all) will kick out of
core and contrib *nearly* all the mysqlism that are a pain.
90% of the time people shouldn't feel the need to write SQL
directly and access functions similar to what Larry wrote or just use
core object methods.

The structure of Larry's query object should be examined carefully to
see if it makes hard to implement stuff like updates with join or
other things (subquery) that may not be so popular but still popular
enough to convince people to skip the whole DB AL and again write SQL
directly.
If a module uses 90% of supported queries and 10% of unsupported
queries... it will still be a hard module to port.
If 90% of the modules use just supported queries, it be a better
situation.
This will determine the success of a new AL.

For the remaining 10% of modules that see a real advantage
of exploiting DB specific queries core should provide a path to put
SQL specific queries in "module-contrib" sort of... so that if the
original author wrote it for a specific DB and someone is interested
in porting it to other DB it won't be a pain to do so.
If every module contrib invent its method to isolate DB specific
code, no one will be so brave to port modules to other DB.

Another thing to consider is that Drupal is a web framework... you'll
have to deal with form, menu, json, xml etc... there are a lot of
repetitive tasks that could be avoided exploiting a bit more metadata
put into the schema api (or elsewhere).
People already wrote about it... but eg. if we are going to support
newer version of DB... checks comes into play, we have constraint
etc... and even avoiding a copy of RoR, you can still automate a lot
of stuff starting from validation if you use metadata into schema.

Abstraction and delegation were separated at birth... and everyone
know that you can solve all CS problems with recursive use of
delegation (including asking Karoly to write pg drivers <g>)... but
still Drupal can't fulfil all needs of all developers in all fields
and even as a framework I'd keep the "web site" use case on top of
priorities.
That means dealing with forms and all the above... if we're not going
to have a full ORM it is worth to offer some helpers that will avoid
to write over and over the same code for common tasks taking
advantage of schema metadata or even renouncing to some higher level
abstraction to put in the schema data some info that won't be
strictly related to the DB but will make much easier to build up web
apps (auto-completion?, nested select (pk/fk) with JS/JQuery?, you
name them...).

BTW Larry, I've found that adding group by, order by info directly in
the fields "property" avoid a lot of typing and keeps info under
sight, my scenario didn't push this to much so there may be
limitations to this approach... but still... it worked for me.

At a first sight moodle functions just look inspirational but the
feeling is they give too much freedom to the coder to put non
cross-compatible SQL inside functions.
Other inspirational sources for DB AL could be:
http://propel.phpdb.org/trac/ propel
http://www.phpdoctrine.org/ doctrine
http://creole.phpdb.org/trac/ creole
ADOdb now has its own Active Record too...
I think there are others...

I'd be very interested in which problems you had to face to succeed
in porting drupal to so many DB.

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it