DB support as contribute: is it a good idea?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folk, Drupal 6.0 is revamped with Schema API, so what's next for Drupal 7.x? PDO for sure! With this powerful data-access abstraction layer, workload will much reduced for DB abstraction layer designers and developers, and finally benefit our contribute developers and end users. By the way, together with the decision of Drupal 7.x + PDO, there is also some voice about moving PostgreSQL (and so other potential databases support, e.g. Oracle, DB2, MSSQL, etc) support away from core, but contribute; on the other hand, add official SQLite support into Drupal core, together with MySQL. Is this really a good idea? Or even if it is possible? As an existing Drupal + PostgreSQL users, what will this affect your daily work? As a potential customer of Drupal + Oracle/DB2/MSSQL/etc, is this a good new for you, or just an evil? I would like to provide some brief idea for you within this article. Please refer to http://groups.drupal.org/node/8855 for more detail information. - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHsq/PBPIQaq+ZRd8RAk+JAKCLgvGCnf8DZvm8fPkA3dYk5tOmuQCgly7J eJrcfdBQOrSeM7o5ET/WmHg= =uZmG -----END PGP SIGNATURE-----
On Wed, 13 Feb 2008 16:52:32 +0800 Edison Wong <hswong3i@edin.no-ip.com> wrote:
By the way, together with the decision of Drupal 7.x + PDO, there is also some voice about moving PostgreSQL (and so other potential databases support, e.g. Oracle, DB2, MSSQL, etc) support away from core, but contribute; on the other hand, add official SQLite support into Drupal core, together with MySQL.
Perfect... put support for any other DB other than MySQL in contrib *but* provide a serious abstraction layer that let people write contrib to support *more than 2* DB ;) That just move the problem from providing an API to providing an SPI, making it worse. Once you decide to go the SPI way... you've to design it in a way that anyone can add support for his beloved DB. You're not in charge of what DB will be plugged nor how. Surely you won't have to develop the plug-in for each DB... but you'll have to provide support without knowing what will get in those plug-ins. Everyone that is serious about DB AL should know that a) it's not easy b) it is a matter of compromises. If you don't compromise (performance, # of supported DB, delegation, level of abstraction, extensibility, schedule...), you don't get the DB AL. Writing an SPI has its pros and cons. Pros: MySQL fans won't encumber core with MySQLism that sometimes make it hard to support other DB. PG fans then will have to write their own plug-in, and if they don't it is up to their business, and if they do, they will do it in an independent way from MySQL people *and* with a *serious* SPI most of the time they will have to cut&paste code from the MySQL plug-in and rewrite few lines. Cons: It is a harder work. It will have to be even more abstract and radical. We will have to extrude current support for the DBs and turn it into plug-ins. Unless we want to see a lot of code duplication we will still have to write a basic API and the plug-ins will override just few methods. So... anyway before we get to an SPI that is worth to have we will have to pass through a better API. So... let's work for putting pg support in contrib for drupal 9. -- Ivan Sergio Borgonovo http://www.webthatworks.it
I recommend my blog post http://drupal4hu.com/node/123 and the linked sandbox http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/crell/pdo/ instead of FUD and confusion. And yes, it is a very good idea to move db drivers into contrib -- provided that core is capable of proper support them. This is what Crell and I are working on.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 It is very simple to comment others idea as FUD without responsibility, but that is totally another story if works is supported by indeed research progress. I have upload my research progress to sandbox (http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/hswong3i/siren/), a patch for Official Drupal 6.0. By applying the patch, we are able to support totally 9 PHP drivers (including mysql, mysqli, pgsql, oci8, pdo_mysql, pdo_pgsql, pdo_oci, pdo_sqlite and pdo_ibm) within single core code base. Progresses are also split into number of small patches (http://groups.drupal.org/node/8663), and ready for review (or even commit). I dig into this field for more than year, and I am also willing to keep it on going, for both MySQL/PostgreSQL/Oracle/SQLite/DB2/etc. Please provide some solid research progress (but not code snippet in brain storming level) to support your FUD comment, or else it is totally not fair. Regards Karoly Negyesi wrote: | I recommend my blog post http://drupal4hu.com/node/123 and the linked sandbox http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/crell/pdo/ instead of FUD and confusion. And yes, it is a very good idea to move db drivers into contrib -- provided that core is capable of proper support them. This is what Crell and I are working on. - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHs8XQBPIQaq+ZRd8RAjR/AKCwzFKihvTTESqO6wSKN/QJBo7xtwCfdsd9 UkFio8H8P16by5vf7/DXdEo= =dPWz -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Moreover, I would like recommend Moodle XMLDB implementation as a solid example (http://docs.moodle.org/en/Development:XMLDB_introduction). The idea of my proposal is quite similar as that, 1. database.*.inc: similar as ADOdb 2. common.*.inc: similar as Moodle DML Library (http://docs.moodle.org/en/DML_functions) 3. schema.*.inc: similar as Moodle DDL Library (http://docs.moodle.org/en/DDL_functions) That should be no conflict with chx and crell's work: Data API should be something belongs to Drupal core API (similar idea as Moodle SQL neutral statements) which is much higher level than that of my research (similar as ADOdb). They should be split but not mix together :-) - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHs+1dBPIQaq+ZRd8RAmzuAKCQJgZWC8XmxY9+nyq3rkxW35L46ACfbDY7 4Gi6s/k9wnJI8Dw7dlF15YA= =A9n0 -----END PGP SIGNATURE-----
On Thu, 14 Feb 2008 15:27:25 +0800 Edison Wong <hswong3i@edin.no-ip.com> wrote:
Moreover, I would like recommend Moodle XMLDB implementation as a solid example (http://docs.moodle.org/en/Development:XMLDB_introduction).
The idea of my proposal is quite similar as that, 1. database.*.inc: similar as ADOdb 2. common.*.inc: similar as Moodle DML Library (http://docs.moodle.org/en/DML_functions) 3. schema.*.inc: similar as Moodle DDL Library (http://docs.moodle.org/en/DDL_functions)
That should be no conflict with chx and crell's work: Data API should be something belongs to Drupal core API (similar idea as Moodle SQL neutral statements) which is much higher level than that of my research (similar as ADOdb). They should be split but not mix together :-)
I've been thinking about which level of abstraction is suited for drupal a bit. I came out with something like what Larry wrote (not as elegant because queries are structured array and to "compose" you add elements to array, you don't queue methods, but I was in a hurry...) just for MS SQL (the scope was letting a door open to transfer the stuff currently in MS SQL to pg that is already running drupal). It's something that can be carried quite far but I'm wondering if it reach the sweet spot in abstraction/performance. Django uses a 3rd party full ORM, but Django has never been a CMS, it always has been a framework. Still Drupal has been a CMS but it is going in the framework direction. Pushing in anything like SQLAlchemy will solve the problem of DB abstraction at the apparent cost of bloat (apparent since it has to be proved that adopting an ORM has such a big impact on performance once you go the multi-db support path...). Anyway hiding the DB too much and putting both feet in the OO realm may have its drawback too. Anyway... Drupal as a framework doesn't have the same wide area of application as Django and core has very well defined objects (I think there are discussions about redefining core objects too, cck, node, field...). It has to be seen if pushing custom DB support into core objects (providing some methods tailored on DB capabilities to access/modify core drupal object) is worth. I faced situations where doing something really exploiting the capabilities of a DB do make the difference... but still if all the things core have to offer to interact with core objects can be achieved efficiently with Larry's approach, we could delay the problem of an even more abstract DB layer some more version. I think that at this moment having a system that let you interact with core object efficiently and let you write DB agnostic code 90% of the time (or avoid you to write SQL code at all) will kick out of core and contrib *nearly* all the mysqlism that are a pain. 90% of the time people shouldn't feel the need to write SQL directly and access functions similar to what Larry wrote or just use core object methods. The structure of Larry's query object should be examined carefully to see if it makes hard to implement stuff like updates with join or other things (subquery) that may not be so popular but still popular enough to convince people to skip the whole DB AL and again write SQL directly. If a module uses 90% of supported queries and 10% of unsupported queries... it will still be a hard module to port. If 90% of the modules use just supported queries, it be a better situation. This will determine the success of a new AL. For the remaining 10% of modules that see a real advantage of exploiting DB specific queries core should provide a path to put SQL specific queries in "module-contrib" sort of... so that if the original author wrote it for a specific DB and someone is interested in porting it to other DB it won't be a pain to do so. If every module contrib invent its method to isolate DB specific code, no one will be so brave to port modules to other DB. Another thing to consider is that Drupal is a web framework... you'll have to deal with form, menu, json, xml etc... there are a lot of repetitive tasks that could be avoided exploiting a bit more metadata put into the schema api (or elsewhere). People already wrote about it... but eg. if we are going to support newer version of DB... checks comes into play, we have constraint etc... and even avoiding a copy of RoR, you can still automate a lot of stuff starting from validation if you use metadata into schema. Abstraction and delegation were separated at birth... and everyone know that you can solve all CS problems with recursive use of delegation (including asking Karoly to write pg drivers <g>)... but still Drupal can't fulfil all needs of all developers in all fields and even as a framework I'd keep the "web site" use case on top of priorities. That means dealing with forms and all the above... if we're not going to have a full ORM it is worth to offer some helpers that will avoid to write over and over the same code for common tasks taking advantage of schema metadata or even renouncing to some higher level abstraction to put in the schema data some info that won't be strictly related to the DB but will make much easier to build up web apps (auto-completion?, nested select (pk/fk) with JS/JQuery?, you name them...). BTW Larry, I've found that adding group by, order by info directly in the fields "property" avoid a lot of typing and keeps info under sight, my scenario didn't push this to much so there may be limitations to this approach... but still... it worked for me. At a first sight moodle functions just look inspirational but the feeling is they give too much freedom to the coder to put non cross-compatible SQL inside functions. Other inspirational sources for DB AL could be: http://propel.phpdb.org/trac/ propel http://www.phpdoctrine.org/ doctrine http://creole.phpdb.org/trac/ creole ADOdb now has its own Active Record too... I think there are others... I'd be very interested in which problems you had to face to succeed in porting drupal to so many DB. -- Ivan Sergio Borgonovo http://www.webthatworks.it
delegation (including asking Karoly to write pg drivers <g>)... but
$db = new PDO('pgsql:dbname=drupal', 'drupal', 'drupal'); $binarydata = "abcdefg\x00a\x00\x01\x02"; $db->exec('CREATE TABLE test (data bytea, comment varchar(64), len integer)'); $db->beginTransaction(); $stmt = $db->prepare("INSERT INTO test (data, comment, len) VALUES (:data, :comment, :len)"); $stmt->bindParam(':len', $len); $stmt->bindParam(':data', $blob, PDO::PARAM_LOB); $blob = fopen('php://memory', 'a'); $len = fwrite($blob, $binarydata); rewind($blob); $comment = 'lob'; $stmt->bindParam(':comment', $comment); $stmt->execute(); $db->commit(); Like that? I *did* write the skeleton of LOB handling for postgresql, mind ya.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hmm... From my point of view, it is just similar as http://netevil.org/uuid/4365876a-cee9-3009-7726-365876a51802 (although it is target for pdo_oci, it share the same idea as pdo_pgsql): <?php $db = new PDO("oci:", "scott", "tiger"); $db->beginTransaction(); // Essential! $stmt = $db->prepare( ~ "INSERT INTO blobtest (id, contenttype, blob) ". ~ "VALUES (:id, :type, EMPTY_BLOB()) ". ~ "RETURNING blob INTO :blob"); $stmt->bindParam(':id', $id); $stmt->bindParam(':type', $type); $stmt->bindParam(':blob', $blob, PDO::PARAM_LOB); $type = 'image/gif'; $id = 1; // generate your own unique id here $blob = fopen('/path/to/a/graphic.gif', 'rb'); $stmt->execute(); $stmt->commit(); ?> http://php.net/pdo#pdo.lobs also come with some example (BTW, this level of code snippet is TOTALLY NOT ENOUGH for a Drupal-style db_query() implementation...): <?php $db = new PDO('odbc:SAMPLE', 'db2inst1', 'ibmdb2'); $stmt = $db->prepare("insert into images (id, contenttype, imagedata) values (?, ?, ?)"); $id = get_new_id(); // some function to allocate a new ID // assume that we are running as part of a file upload form // You can find more information in the PHP documentation $fp = fopen($_FILES['file']['tmp_name'], 'rb'); $stmt->bindParam(1, $id); $stmt->bindParam(2, $_FILES['file']['type']); $stmt->bindParam(3, $fp, PDO::PARAM_LOB); $db->beginTransaction(); $stmt->execute(); $db->commit(); ?> On the other hand, I get a functional implementation of pdo_pgsql *done* with correct PDO + PostgreSQL BLOB handling, based on our existing D6 DB API with minimal amount of logic change: http://edin.no-ip.com/viewvc/siren/includes/database.pdo_pgsql.inc?view=mark... http://edin.no-ip.com/viewvc/siren/includes/common.postgresql.inc?view=marku... http://edin.no-ip.com/viewvc/siren/includes/schema.postgresql.inc?view=marku... http://edin.no-ip.com/viewvc/siren/includes/install.postgresql.inc?view=mark... Like this one? Don't forget that they are parallel developed with pdo_mysql, pdo_oci, pdo_sqlite, and pdo_ibm, where all of the other are both function correctly. I am not trying to trigger a war of implementation; BTW, a tiny amount brain-storming-level code snippet (which similar implementation is all around the world...) is not enough for supporting a positive technical-based discussion ;-( Karoly Negyesi wrote: |> delegation (including asking Karoly to write pg drivers <g>)... but | | $db = new PDO('pgsql:dbname=drupal', 'drupal', 'drupal'); | $binarydata = "abcdefg\x00a\x00\x01\x02"; | $db->exec('CREATE TABLE test (data bytea, comment varchar(64), len integer)'); | $db->beginTransaction(); | $stmt = $db->prepare("INSERT INTO test (data, comment, len) VALUES (:data, :comment, :len)"); | $stmt->bindParam(':len', $len); | $stmt->bindParam(':data', $blob, PDO::PARAM_LOB); | $blob = fopen('php://memory', 'a'); | $len = fwrite($blob, $binarydata); | rewind($blob); | $comment = 'lob'; | $stmt->bindParam(':comment', $comment); | $stmt->execute(); | $db->commit(); | | Like that? I *did* write the skeleton of LOB handling for postgresql, mind ya. - -- Edison Wong hswong3i@gmail.com http://edin.no-ip.com/html/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHuOU7BPIQaq+ZRd8RAivRAKCW0wbmUokTo8nPl/21GONyJfNUzACfbOKg WxdXzpYkBAsr4aRlsETOsk4= =2jiO -----END PGP SIGNATURE-----
participants (3)
-
Edison Wong -
Ivan Sergio Borgonovo -
Karoly Negyesi