On May 5, 2009, at 1:54 PM, Bertrand Mansion wrote:
Well, I think I know everything there is to know about Drupal. I have been developing modules for it for 4 years now and have deployed a dozen websites for customers, some quite large... I think it is a good opportunity now, with the emergence of these new databases, to think about what we have been doing for years, and how.
Instead of being arrogant and underestimating others, you should start by asking yourself if there really isn't any other way to better manage tags (and cache, and sessions, and hierarchies, and callbacks, and file storage, etc).
You're quite right, Bertrand, and I apologize for the snarkiness of my comment. Since the article you pointed to described Drupal's tagging system *without* suggesting it was a fundamentally flawed approach, I assumed you were not familiar with the Taxonomy system's internals. This is not a matter of arrogance but of misinterpreting your statement. As I'm sure you know from being on the devel list, there is an unending stream of "Drupal Should Do X Like Y, And Here's A Blog Post To Prove It" comments that are not necessarily rooted in familiarity with the way the system already works.

Greg's statement, though, stands: Taxonomy as it currently exists is a generalized metadata system, and the optimizations discussed in the first two parts of the article you linked to are not possible without building an entirely different set of specialized systems. The third model explained in that article is what Drupal uses currently.

A number of other developers have suggested that other approaches might be good: rather than tying ourselves to a relational model, we should consider treating nodes as cached objects, for example. Doing so would probably yield some great improvements for the specific use cases we optimize the storage mechanism for. I could be wrong, but at present a traditional SQL backend is still our best bet for a generalized system that allows users to design their schemas and their views in an ad-hoc fashion without writing code.

Is there any way that something other than SQL could leverage multiple loosely connected systems like CCK, Taxonomy, and Views without crippling performance in other areas? That's not a rhetorical question; I'm curious and would like to know if I'm overlooking some fundamental issues.

--Jeff Eaton