Re: [development] PDO and database limitations
Currently I'm doing an experimental Oracle port, and for that need I've done a quick travel in the drupal code... Frankly, I don't think PDO could be a solution for a better database portability, at least in drupal. From what I've seen, the database usage is pretty plain (some "create" in ".install" and mostly simple (only some table involved) "select", "insert" and "update" - or, better, "upsert" in the code). My first feeling was to create a very light db persistence layer, based on php dynamic object creation (__get and __set): define a class with an array on info like: name, type (number, text, large text and blob), is (part of) primary key, is indexable. Do a mapping into a "create table" should be easy. The actual common usage of "db_fetch_object" may be left more or less unchanged, simply creating queries asking what fields from what table with what filters. The "join" case can be simplifyed using views. About the db_num_rows question: there are more than a good reason to avoid it. As an example, the default behavior for the postgresql driver is to fetch in memory _all_ the results before returning the first one to the caller: with big tables this is simply devastating (and can be removed from future releases). And Oracle simply doesn't support it. My 2cents... Best Regards, Gabriele Turchi P.S.: I'm sorry, my english il alpha version...
Hi Gabriele. That sort of dynamic ORM is something I've been working on as well, but absolutely requires PHP 5. That means we can't really consider it for Drupal for another year at least. :-( PDO is *not* meant to be a complete DB abstraction; it's an API unification. I'm writing it as a new backend for Drupal as an alternative to mysql and pgsql APIs, because it should be faster than our own prepared statement support. Hopefully it will also make Oracle et al easier to support, as then there's only one set of PHP calls needed. Eventually, when PHP 4 goes away the separate DB backends can go away, too. And yes, db_num_rows() is now on my hit list. :-) On Saturday 07 April 2007 4:12 am, Gabriele Turchi wrote:
Currently I'm doing an experimental Oracle port, and for that need I've done a quick travel in the drupal code...
Frankly, I don't think PDO could be a solution for a better database portability, at least in drupal.
From what I've seen, the database usage is pretty plain (some "create" in ".install" and mostly simple (only some table involved) "select", "insert" and "update" - or, better, "upsert" in the code).
My first feeling was to create a very light db persistence layer, based on php dynamic object creation (__get and __set): define a class with an array on info like: name, type (number, text, large text and blob), is (part of) primary key, is indexable. Do a mapping into a "create table" should be easy.
The actual common usage of "db_fetch_object" may be left more or less unchanged, simply creating queries asking what fields from what table with what filters. The "join" case can be simplifyed using views.
About the db_num_rows question: there are more than a good reason to avoid it. As an example, the default behavior for the postgresql driver is to fetch in memory _all_ the results before returning the first one to the caller: with big tables this is simply devastating (and can be removed from future releases). And Oracle simply doesn't support it.
My 2cents...
Best Regards, Gabriele Turchi
P.S.: I'm sorry, my english il alpha version...
-- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
On 4/7/07, Gabriele Turchi <gabriele.turchi@l39a.com> wrote:
Frankly, I don't think PDO could be a solution for a better database portability, at least in drupal.
Can you explain in some more detail why you think so? My first feeling was to create a very light db persistence layer, based
on php dynamic object creation (__get and __set): define a class with an array on info like: name, type (number, text, large text and blob), is (part of) primary key, is indexable. Do a mapping into a "create table" should be easy.
I have mixed feelings about abstraction layers. I am not opposed to abstracting DDL (table creations, alters, ...etc.), but have mixed feelings on full abstraction of DML (select, insert, ...etc.), on one hand it makes development easier, but they tend to be lowest common denominator and failing to optimize to the specific engine in use. However, we already have an abstraction layer in Drupal that has worked well. The drawback is that it is not a standard though. If we are to write a new layer, then it has to be something with wide support for various databases, and PDO seems to be that layer. About the db_num_rows question: there are more than a good reason to
avoid it. As an example, the default behavior for the postgresql driver is to fetch in memory _all_ the results before returning the first one to the caller: with big tables this is simply devastating (and can be removed from future releases). And Oracle simply doesn't support it.
Many databases do not support, but for those that do, it avoids the performance penalty of two queries (one SELECT col1, col2 ...., and one SELECT COUNT(*)). In your other message, you mentioned that db_num_rows is bad for performance. How is that so? Is this true only for PostgreSQL? -- 2bits.com http://2bits.com Drupal development, customization and consulting.
Il giorno sab, 07/04/2007 alle 10.30 -0400, Khalid Baheyeldin ha scritto:
On 4/7/07, Gabriele Turchi <gabriele.turchi@l39a.com> wrote: Frankly, I don't think PDO could be a solution for a better database portability, at least in drupal.
Can you explain in some more detail why you think so?
Of course. From my point of view, PDO is simply a "API unification". But, if you have a little more complex target than creating simple tables and fetching data from them, you may need to specialize your queries for any specific database. To reach this objective, a way is to make a more general way to describe a query, and leave to a lower application lever the job to write the query, the better one for this database.
My first feeling was to create a very light db persistence layer, based on php dynamic object creation (__get and __set): define a class with an array on info like: name, type (number, text, large text and blob), is (part of) primary key, is indexable. Do a mapping into a "create table" should be easy.
I have mixed feelings about abstraction layers. I am not opposed to abstracting DDL (table creations, alters, ...etc.), but have mixed feelings on full abstraction of DML (select, insert, ...etc.), on one hand it makes development easier, but they
Frankly, I share this feelings too.
tend to be lowest common denominator and failing to optimize to the specific engine in use.
Mhh, using PDO you have two choices: always maintain multiple backends (and PDO is useless), or simplify the queries to the lowest common denominator... In our case, using an abstraction layer (modeled pretty close to the tables structure) should make possible to have some some sort of "source level optimization). As an example, somewhere in the code there are some "select distinct b.* from ...": manipulating this query is a nightmare! To be compatible with Oracle I need to "unpack" the field list from the "b.*" and apply the "distinct" only to some of them. If the table was created as an object (and the application can simply access the fields list), and if I have a way to (simply) ask "fetch all - distinct - lines from that table" my job can be really simpler. From a more technical point of view, I simply hate the abstraction layers when they are too rigid ("struts" mean anything for you?). The ability to make dynamic changing objects in php is - from my experience - the best starting point.
However, we already have an abstraction layer in Drupal that has worked well. The drawback is that it is not a standard though. If we are to write a new layer, then it has to be something with wide support for various databases, and PDO seems to be that layer.
About the db_num_rows question: there are more than a good reason to avoid it. As an example, the default behavior for the postgresql driver is to fetch in memory _all_ the results before returning the first one to the caller: with big tables this is simply devastating (and can be removed from future releases). And Oracle simply doesn't support it.
Many databases do not support, but for those that do, it avoids the performance penalty of two queries (one SELECT col1, col2 ...., and one SELECT COUNT(*)).
In your other message, you mentioned that db_num_rows is bad for performance. How is that so? Is this true only for PostgreSQL?
As I said, when available, the db_num_rows depends on fetching in memory _all_ rows from the database: this is a big performance cost (fetching many data from disk to database memory, from database memory to network, from network to apache/php memory). Even worse if these data aren't really needed. Only mysql seems to have a more efficient way to do so. but...
-- 2bits.com http://2bits.com Drupal development, customization and consulting.
Best Regards, Gabriele
participants (3)
-
Gabriele Turchi -
Khalid Baheyeldin -
Larry Garfield