[development] Database / SQL future thoughts

Bertrand Mansion drupal at mamasam.net
Tue May 5 18:54:23 UTC 2009

On Tue, May 5, 2009 at 5:40 PM, Jeff Eaton <jeff at viapositiva.net> wrote:
> On May 5, 2009, at 10:07 AM, Bertrand Mansion wrote:
>>> 1. Tags are stored inefficiently (I can't think of a way to store them
>>> that is better for every use)
>>
>> What do you mean by "every use"?
>> If you are interested in reading about tags and SQL, here are some pointers:
>> http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html
>> http://laughingmeme.org/2005/04/07/in-lieu-of-the-promised-article-on-tags-and-sql/
>
> When Greg says, "Every use" he means that Drupal allows tags and taxonomy
> terms in general to be used for a lot of metadata related purposes. It's
> possible to optimize the storage and retrieval mechanism for specific use
> cases (like user-specific flickr-style tags, or hierarchical
> categorization), but the optimizations for those use cases are at odds with
> each other. What works for one will make the others punishingly inefficient.
>
> Thus, Drupal currently uses a 'best-compromise' schema that allows it to
> capture as much information as possible (hierarchy, weight, association,
> etc.) and relies on caching at a later point to smooth out hot spots.
>
> There may well be further improvements that can be eked out, and there may
> be opportunities for optimization that have been missed -- and there may
> even be a case to be made for splitting taxonomy into real "tags" and
> "hierarchical category" so that the system can be better optimized. But I'm
> not sure that you're really clear on how Drupal actually works under the
> hood; the article you pointed to explicitly described Drupal's tag storage
> schema and outlined its advantages. If you go back and read the article,
> it's the "Toxi Solution."
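
For readers who have not followed the first link: the "Toxi" layout
discussed there is a three-table design, with one table for items, one
for tags, and a join table between them. The sketch below is only an
illustration of that general shape, not Drupal's actual schema. The
table names (item, tag, item_tag), the sample rows, and the helper
functions are all hypothetical, and the extra information mentioned
above (hierarchy, weight) is left out to keep it short.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a three-table tag layout.

    All table and column names here are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        -- Join table: one row per (item, tag) association.
        CREATE TABLE item_tag (
            item_id INTEGER NOT NULL REFERENCES item(id),
            tag_id INTEGER NOT NULL REFERENCES tag(id),
            PRIMARY KEY (item_id, tag_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO item (id, title) VALUES (?, ?)",
        [(1, "Schema notes"), (2, "Caching notes"), (3, "Release notes")],
    )
    conn.executemany(
        "INSERT INTO tag (id, name) VALUES (?, ?)",
        [(1, "sql"), (2, "drupal"), (3, "cache")],
    )
    conn.executemany(
        "INSERT INTO item_tag (item_id, tag_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 2), (2, 3), (3, 2)],
    )
    return conn


def items_with_all_tags(conn, names):
    """Return titles of items carrying every tag in `names`, sorted by title."""
    names = list(dict.fromkeys(names))  # drop duplicate tag names, keep order
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    query = f"""
        SELECT i.title
        FROM item AS i
        JOIN item_tag AS it ON it.item_id = i.id
        JOIN tag AS t ON t.id = it.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY i.id
        HAVING COUNT(DISTINCT t.id) = ?
        ORDER BY i.title
    """
    return [row[0] for row in conn.execute(query, (*names, len(names)))]


if __name__ == "__main__":
    conn = build_demo()
    print(items_with_all_tags(conn, ["sql", "drupal"]))
```

The join-and-count query is the usual price of this layout: finding
items that match several tags at once needs two joins and a GROUP BY,
which is the kind of hot spot the quoted text says is left to caching.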

Well, I think I know everything there is to know about Drupal. I have
been developing modules for it for 4 years now and have deployed a dozen
websites for customers, some quite large... I think now is a good
opportunity, with the emergence of these new databases, to rethink
what we have been doing for years, and how.

Instead of being arrogant and underestimating others, you should start
by asking yourself whether there really is no better way to manage
tags (and cache, and sessions, and hierarchies, and callbacks, and
file storage, etc).

Bertrand Mansion
