[development] Database / SQL future thoughts

Gerhard Killesreiter gerhard at killesreiter.de
Tue May 5 16:08:33 UTC 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bertrand Mansion wrote:

> On Tue, May 5, 2009 at 4:18 PM, Greg Knaddison
> <Greg at growingventuresolutions.com> wrote:
>> This is some interesting feedback.  I'd love to hear more of your
>> thoughts so I can understand it better.
>>
>> On Tue, May 5, 2009 at 7:51 AM, Bertrand Mansion <drupal at mamasam.net> wrote:
>>> Drupal doesn't really need a relational DB and actually doesn't use
>>> the relational features properly (for example, the way tags are stored
>>> is not efficient). That's one of the reasons why it is slow and
>>> doesn't scale very well.
>> These are some pretty sweeping claims.  Can you expand on them?
> 
> These are not claims, but only my opinion, based on my experiences
> with Drupal.


With which types of sites did you have these experiences?

>> 1. Tags are stored inefficiently (I can't think of a way to store
>> them that is better for every use)
> 
> What do you mean by "every use" ?

He probably means "every possible use-case".

> If you are interested in reading about tags and SQL, here are some pointers:
> http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html
> http://laughingmeme.org/2005/04/07/in-lieu-of-the-promised-article-on-tags-and-sql/
> 
> The way Drupal stores tags hierarchy is also inefficient.

Let's first agree on some terminology. In the Drupalverse, we have

1) a term, which is part of a vocabulary. The vocabulary can be
hierarchical or not, and allow some other special cases.

2) a tag, this is a special case of a term, namely a term from a
vocabulary that has the "free tagging" bit switched on. This also
means the vocabulary has no hierarchy.

Drupal 6 is particularly problematic if you have

a) a lot of nodes
b) which have several terms each (doesn't matter which type)
and
c) have many revisions.

In Drupal 5 terms were not revisioned; in D6 they are, and this causes
MySQL to sift through far more entries than it had to before.
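
To make the growth concrete, here is a minimal sketch, not Drupal's
actual schema or code. It assumes a simplified term_node table with one
row per (node, term) in the D5 style and one row per (revision, term) in
the D6 style, and it assumes every revision keeps all of its node's
terms. The function name term_node_rows and its parameters are made up
for this illustration.

```python
import sqlite3


def term_node_rows(nodes, terms_per_node, revisions_per_node, revisioned):
    """Count rows in a simplified, hypothetical term_node table."""
    db = sqlite3.connect(":memory:")
    if revisioned:
        # D6-style assumption: terms are attached to each node revision (vid).
        db.execute(
            "CREATE TABLE term_node ("
            "nid INTEGER, vid INTEGER, tid INTEGER, PRIMARY KEY (tid, vid))"
        )
    else:
        # D5-style assumption: terms are attached to the node only.
        db.execute(
            "CREATE TABLE term_node ("
            "nid INTEGER, tid INTEGER, PRIMARY KEY (tid, nid))"
        )

    vid = 0
    for nid in range(1, nodes + 1):
        for rev in range(revisions_per_node):
            vid += 1  # every revision gets its own id
            for tid in range(1, terms_per_node + 1):
                if revisioned:
                    db.execute(
                        "INSERT INTO term_node VALUES (?, ?, ?)",
                        (nid, vid, tid),
                    )
                elif rev == 0:
                    # Without revisioning, only one row per (node, term).
                    db.execute(
                        "INSERT INTO term_node VALUES (?, ?)", (nid, tid)
                    )

    (count,) = db.execute("SELECT COUNT(*) FROM term_node").fetchone()
    db.close()
    return count


if __name__ == "__main__":
    print(term_node_rows(100, 3, 10, revisioned=False))
    print(term_node_rows(100, 3, 10, revisioned=True))
```

Under these assumptions the table grows with nodes x terms x revisions
instead of nodes x terms, so any query that joins against it has
correspondingly more rows to consider.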

drupal.org's handbooks and projects are use cases for this.

>> 2. Drupal doesn't scale very well (I've seen enough claims otherwise -
>> is there a particular problem you can point to?)
> 
> Lots of problems for me, but that's off topic.

No, we are on the development list, where we can discuss such topics.

> Sessions, cache, sql queries, table structures, file storage,
> architecture and callback hell, etc.

You should probably open a new thread for each topic in order not to
clutter this one too much.

> It's currently the best CMS for PHP I know, but still, it is
> sluggish (yes, this is subjective).

Very.

>>> But going for another storage system would be better if implemented as
>>> a fork, IMO.
>> Various people just rewrote the entire DB API, so it is possible to
>> make massive API changes within a release cycle.  Why do you feel a
>> fork is necessary?
> 
> Moving to something like CouchDB as was suggested in the thread or

Somebody (David?) brought up the topic of CouchDB, but I am not aware
of a serious development effort.

> some other datastores (someone mentioned Hadoop but I think he meant
> BigTable, Dynamo or Tokyo Cabinet...) would need more than just
> rewriting the DB API in my opinion. That's why I can only imagine
> this as a fork. That would be my reply to the first post.


I don't think that a Drupal fork would be viable.

Cheers,
	Gerhard


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkoAZIEACgkQfg6TFvELooQ/uwCfcQT8jlJzYQLgtHfB7XQha5KC
racAoIrjNmAuejqz1D5m6w5cbNpZuJ0u
=w+4e
-----END PGP SIGNATURE-----

