On Sunday 10 August 2008 2:49:58 pm Ethan Fremen wrote:
Gentlebeings,
I've read the recent thread on devel->staging->deployment, and I wanted to share what I've done in the area.
My main interest lies in moving away from monotonically incrementing integers as id values, so that I have a greater chance of being able to "shard" the Drupal DB for high performance.
As a first step, I am working on moving Drupal core to 64-bit integers. It was relatively trivial to change schema to create tables with 64-bit ids, but right now there's nothing in schema that marks "foreign keys" as also being special.
Would anyone object to schema requiring foreign ID references to be marked specially?
Anyway, the direction I'm interested in heading is a 64 bit int + creation date timestamp for every row; the two can be combined to form a valid UUID if there are useful reasons for doing so.
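One way the 64-bit id and the creation timestamp might be packed into a single 128-bit value is sketched below. This is only an illustration: the function names, the choice to put the timestamp in the high 64 bits, and the use of Unix seconds are all assumptions, not something specified above. It also does not set the RFC 4122 version and variant bits, so whether the result counts as a "valid" UUID in the strict sense would depend on how those bits are handled.

```python
import uuid
from typing import Tuple

MASK_64 = 2**64 - 1


def combine_to_uuid(row_id: int, created_ts: int) -> uuid.UUID:
    """Pack a 64-bit row id and a 64-bit creation timestamp into 128 bits.

    Hypothetical layout: timestamp in the high 64 bits, row id in the low 64.
    """
    if not 0 <= row_id <= MASK_64:
        raise ValueError("row_id must fit in 64 bits")
    if not 0 <= created_ts <= MASK_64:
        raise ValueError("created_ts must fit in 64 bits")
    return uuid.UUID(int=(created_ts << 64) | row_id)


def split_uuid(value: uuid.UUID) -> Tuple[int, int]:
    """Recover (row_id, created_ts) from a value built by combine_to_uuid."""
    return value.int & MASK_64, value.int >> 64


if __name__ == "__main__":
    u = combine_to_uuid(42, 1218379798)
    print(u)              # 128-bit value in the usual 8-4-4-4-12 hex form
    print(split_uuid(u))  # the original (row_id, created_ts) pair
```

Putting the timestamp in the high bits means the combined values sort by creation time first, which may or may not be desirable for sharding.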
Note that PostgreSQL has UUID-creating functions, and MySQL 5.1 has uuid_short(), which generates a 64-bit unsigned integer derived from the server ID, the server start time, and a counter. The UUID() algorithm in MySQL 5.0 isn't viable for scaling purposes because it's not cluster-safe.
I do have some preliminary performance data on the speed at which one can create UUIDs: http://mindlace.net/archives/2008/06/23/generating-uuids-in-php-for-drupal/
I'm very interested in any feedback about the feasibility of at least widening the ids to 64 bits in D7.
~ethan
There was discussion of including foreign keys in Schema API in Drupal 6, but it was dropped after we determined that we couldn't actually do anything with that information at the time. We still have to support MySQL MyISAM tables, which are far and away the most common, and those don't support foreign keys. (Therefore we can't rely on integrity checking, cascading delete/update, etc.) I believe there was consideration of adding FK support in Drupal 7 to allow add-on functionality in places, and I'm certainly open to doing so, but not until the existing 300 KB database overhaul patch has landed. :-)

Serial fields are actually very useful, and there's nothing wrong with them. In fact, they are a requirement if we want even remotely intelligible URLs. nids, uids, tids, etc. are all used in URLs, and most Drupal nodes do not in fact have a URL alias on them AFAIK. If we add some form of GUID to the system (which I am not against, and Greg Dunlap has made good arguments for), it will have to be in addition to existing serial fields.

We also can't rely on MySQL 5.1 at this point, as it's not even fully stable, to say nothing of widely deployed. Given that, I don't see much advantage for 99% of sites in using 64-bit unique IDs over 32-bit. It wouldn't break anything, I suppose, but how many Drupal sites have the billions of nodes (roughly 2.1 billion for a signed 32-bit integer) required to run out of the 32-bit space? I can't actually think of any.

-- 
Larry Garfield
larry@garfieldtech.com
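To put rough numbers on the 32-bit question, here is a small back-of-the-envelope sketch. The rate of 10,000 new rows per day is a made-up figure for illustration only, and the helper name is hypothetical.

```python
SIGNED_32_MAX = 2**31 - 1    # largest id in a signed 32-bit column
UNSIGNED_32_MAX = 2**32 - 1  # largest id in an unsigned 32-bit column
SIGNED_64_MAX = 2**63 - 1    # largest id in a signed 64-bit column


def years_to_exhaust(max_id: int, rows_per_day: int) -> float:
    """Years until a serial column reaches max_id at a constant insert rate."""
    return max_id / rows_per_day / 365


if __name__ == "__main__":
    rate = 10_000  # assumed new rows per day, purely illustrative
    for name, limit in [
        ("signed 32-bit", SIGNED_32_MAX),
        ("unsigned 32-bit", UNSIGNED_32_MAX),
        ("signed 64-bit", SIGNED_64_MAX),
    ]:
        print(f"{name}: max id {limit:,}, "
              f"about {years_to_exhaust(limit, rate):,.0f} years at {rate:,}/day")
```

Even at that rate a signed 32-bit serial lasts for centuries, so the case for 64-bit ids rests on sharding and id-generation schemes rather than on running out of values.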