[development] Database / SQL future thoughts

Bertrand Mansion drupal at mamasam.net
Tue May 5 20:12:44 UTC 2009


On Tue, May 5, 2009 at 9:30 PM, Jeff Eaton <jeff at viapositiva.net> wrote:
> On May 5, 2009, at 1:54 PM, Bertrand Mansion wrote:
>
>> Well, I think I know everything there is to know about Drupal. I have
>> been developing modules for it for 4 years now and have deployed a dozen
>> websites for customers, some quite large... I think it is a good
>> opportunity now, with the emergence of these new databases, to think
>> about what we have been doing for years, and how.
>>
>> Instead of being arrogant and underestimating others, you should start
>> by asking yourself if there really isn't any other way to better
>> manage tags (and cache, and sessions, and hierarchies, and callbacks,
>> and file storage, etc).
>
> You're quite right, Bertrand, and I apologize for the snarkiness of my
> comment. Since that article you pointed to described Drupal's tagging system
> *without* suggesting it was a fundamentally flawed approach, I assumed you
> were not familiar with the Taxonomy system's internals. This is not a matter
> of arrogance but of misinterpreting your statement. As I'm sure you know
> from being on the devel list, there is an unending stream of "Drupal Should
> Do X Like Y, And Here's A Blog Post To Prove It" comments that are not
> necessarily rooted in familiarity with the way the system already works.
>
> Greg's statement, though, stands: Taxonomy as it presently stands is a
> generalized metadata system, and the optimizations discussed in the first
> two parts of the article you linked to are not possible without building an
> entirely different set of specialized systems. The third model, explained in
> the article that you linked to, is what Drupal uses currently.
>
> A number of other developers have suggested that other approaches might be
> good -- rather than tying ourselves to a relational model, we should
> consider treating nodes as cached objects, for example. Doing so would
> probably yield some great improvements for the specific use cases we
> optimize the storage mechanism for. I could be wrong, but at present the use
> of a traditional SQL backend is still our best bet for a generalized system
> that allows users to design their schemas and their views in an ad-hoc
> fashion without writing code.
>
> Is there any way that something other than SQL could leverage multiple
> loosely connected systems like CCK, Taxonomy, and Views without crippling
> performance in other areas? That's not a rhetorical question; I'm curious
> and would like to know if I'm overlooking some fundamental issues.

This is exactly what I am investigating for another project where I
use Tokyo Tyrant with PHP. I don't have figures or concrete solutions
yet, but I find it very interesting and challenging to try to think
differently. In an application I wrote, I chose to manage my tags
differently (a single table with lots of redundancy, but fast), and
it worked well. In another application, I tried a different way of
handling child/parent relations (not Celko's way, but using LIKE,
depths and paths); it also worked well, and was easier to manage and
faster.
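
To make the "LIKE, depths and paths" idea concrete, here is a minimal
sketch of one way such a hierarchy could be stored. It is not the code
from the application mentioned above: the table name `category`, the
column names and the helper functions are all assumptions made for
illustration, and SQLite stands in for whatever database was actually
used.

```python
import sqlite3

# Hypothetical schema: every row stores its full ancestry as a path
# string such as '/1/2/3/' plus its depth, instead of left/right bounds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " path TEXT NOT NULL,"
    " depth INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX category_path ON category (path)")


def add_category(name, parent_id=None):
    """Insert a row, then derive its path and depth from its parent."""
    if parent_id is None:
        parent_path, parent_depth = "/", -1
    else:
        parent_path, parent_depth = conn.execute(
            "SELECT path, depth FROM category WHERE id = ?", (parent_id,)
        ).fetchone()
    cur = conn.execute(
        "INSERT INTO category (name, path, depth) VALUES (?, '', ?)",
        (name, parent_depth + 1),
    )
    new_id = cur.lastrowid
    conn.execute(
        "UPDATE category SET path = ? WHERE id = ?",
        (f"{parent_path}{new_id}/", new_id),
    )
    return new_id


def descendants(node_id):
    """Names of all rows below node_id, found with one LIKE prefix match."""
    (path,) = conn.execute(
        "SELECT path FROM category WHERE id = ?", (node_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT name FROM category WHERE path LIKE ? AND id != ? ORDER BY path",
        (path + "%", node_id),
    )
    return [name for (name,) in rows]


def children(node_id):
    """Names of direct children only: same prefix match, depth + 1."""
    path, depth = conn.execute(
        "SELECT path, depth FROM category WHERE id = ?", (node_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT name FROM category WHERE path LIKE ? AND depth = ? ORDER BY path",
        (path + "%", depth + 1),
    )
    return [name for (name,) in rows]


def ancestor_ids(node_id):
    """Ancestor ids read straight from the path, without a further query."""
    (path,) = conn.execute(
        "SELECT path FROM category WHERE id = ?", (node_id,)
    ).fetchone()
    return [int(part) for part in path.strip("/").split("/")][:-1]


# Sample data, purely illustrative.
science = add_category("Science")
physics = add_category("Physics", science)
optics = add_category("Optics", physics)
chemistry = add_category("Chemistry", science)
arts = add_category("Arts")
```

The trade-off in this sketch is that moving a subtree means rewriting
the path of every row under it, while reads stay a single prefix match.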

With these new databases, I found it very difficult at first to
forget everything about relational DBs and ORM-like solutions where a
table looks like an object. I am almost sure that Drupal, like any
other CMS, could take advantage of systems like CouchDB or Tokyo
Cabinet (or others). Take the 'node' and 'node_revision' tables, for
example: in such DBs they wouldn't need to be separate entities.
CouchDB has versioning. Tokyo Cabinet can compress your data on the
fly, so you can store many versions of your node without having to
worry about relations.
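
As a rough illustration of keeping a node and its revisions in a
single record, the sketch below uses a plain Python dict as a stand-in
for a key-value store and compresses each record with zlib. None of
this is the CouchDB or Tokyo Cabinet API; the key format `node:1` and
the function names are hypothetical.

```python
import json
import zlib

# Stand-in for a key-value database: key -> compressed bytes.
store = {}


def load_document(key):
    """Return the decoded record for key, or an empty one if it is new."""
    blob = store.get(key)
    if blob is None:
        return {"revisions": []}
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def save_revision(key, fields):
    """Append a new revision to the record and write it back compressed."""
    doc = load_document(key)
    doc["revisions"].append(fields)
    store[key] = zlib.compress(json.dumps(doc).encode("utf-8"))
    return len(doc["revisions"])  # 1-based revision number


def get_revision(key, number=None):
    """Return the latest revision, or a specific 1-based revision number."""
    revisions = load_document(key)["revisions"]
    if not revisions:
        raise KeyError(key)
    if number is None:
        return revisions[-1]
    if not 1 <= number <= len(revisions):
        raise IndexError(f"no revision {number} for {key}")
    return revisions[number - 1]


# Two revisions of the same node live under one key, with no join needed.
save_revision("node:1", {"title": "Hello", "body": "First draft"})
save_revision("node:1", {"title": "Hello, world", "body": "Second draft"})
```

In this sketch, reading the current revision and reading an old one
both cost one key lookup, at the price of rewriting the whole record
on every save.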

CCK wouldn't even be needed, because Tokyo Cabinet tables, like
CouchDB's, can have an arbitrary number of fields. It is your
application that decides which fields are required, not your database.
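
A small sketch of what "the application decides which fields are
required" could look like. The store is again just a Python dict, and
the content types and field names in `REQUIRED_FIELDS` are invented
for the example; nothing here is taken from CCK or from a real
database API.

```python
# Hypothetical per-type rules kept in application code, not in a schema.
REQUIRED_FIELDS = {
    "story": {"title", "body"},
    "event": {"title", "start_date", "location"},
}


def missing_fields(record):
    """Return the required fields that the record lacks, sorted by name."""
    node_type = record.get("type")
    if node_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown content type: {node_type!r}")
    return sorted(REQUIRED_FIELDS[node_type] - set(record))


def save(table, key, record):
    """Store the record as-is; extra fields are allowed, missing ones are not."""
    missing = missing_fields(record)
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    table[key] = dict(record)


table = {}  # stand-in for a schemaless table

# An extra 'subtitle' field needs no schema change.
save(table, "node:1", {"type": "story", "title": "Hi", "body": "Text",
                       "subtitle": "An optional extra"})
```

Adding a field to a content type then means editing `REQUIRED_FIELDS`
(or simply writing the new field), with no ALTER TABLE involved.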

I find it quite exciting; you should see for yourself.

PS: I'll release a PHP class that talks to Tokyo Tyrant soon,
probably in PEAR.

-- 
Bertrand Mansion
Mamasam

