[development] Unique/Random IDs and drupal

Larry Garfield larry at garfieldtech.com
Sun Aug 10 20:57:27 UTC 2008

On Sunday 10 August 2008 2:49:58 pm Ethan Fremen wrote:
> Gentlebeings,
> I've read the recent thread on devel->staging->deployment, and I
> wanted to share what I've done in the area.
> My main interest lies in moving away from monotonically incrementing
> integers as id values so I have a greater chance of being able to
> "shard" the drupal DB for high performance.
> As a first step, I am working on moving drupal core to using 64 bit
> integers. It was relatively trivial to change schema to create 64 bit
> tables, but right now there's nothing in schema that marks "foreign
> keys" as also being special.
> Would anyone object to schema requiring foreign ID references to be
> marked specially?
> Anyway, the direction I'm interested in heading is a 64 bit int +
> creation date timestamp for every row; the two can be combined to form
> a valid UUID if there are useful reasons for doing so.
> Note that Postgresql has UUID creating functions and MySQL 5.1 has
> uuid_short() which generates a 64 bit random int based on the UUID
> algorithm.  The UUID() algorithm in mysql 5.0 isn't viable for scaling
> purposes because it's not cluster-safe.
> I do have some preliminary performance data on the speed at which one
> can create UUIDs:
> http://mindlace.net/archives/2008/06/23/generating-uuids-in-php-for-drupal/
> I'm very interested in any feedback about the feasibility of at least
> widening the ids to 64 bits in D7.
> ~ethan

There was discussion of including foreign keys in Schema API in Drupal 6, but 
it was dropped after we determined that we couldn't actually do anything with 
that information at the time.  We still have to support MySQL MyISAM tables, 
which are far and away the most common, and those don't support foreign keys.  
(Therefore we can't rely on integrity checking, cascading delete/update, 
etc.)  I believe there was consideration of adding FK support in Drupal 7 to 
allow add-on functionality in places, and I'm certainly open to doing so, but 
not until the existing 300 KB database overhaul patch has landed. :-)

Serial fields are actually very useful, and there's nothing wrong with them.  
In fact, they are a requirement if we want even remotely intelligible URLs.  
nids, uids, tids, etc. are all used in URLs, and most Drupal nodes do not in 
fact have a URL alias on them AFAIK.  If we add some form of GUID to the 
system (which I am not against, and Greg Dunlap has made good arguments for) 
it will have to be in addition to existing serial fields.
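
To illustrate what "in addition to" could look like, here is a minimal sketch, not Drupal code. It uses Python's sqlite3 and uuid modules, and the table name node_example, its columns, and the helper functions are all hypothetical. The serial nid stays the primary key and keeps driving the URL path, while a GUID column sits beside it for cross-site identification.

```python
import sqlite3
import uuid


def open_db():
    """Create an in-memory database with a hypothetical node_example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_example ("
        " nid INTEGER PRIMARY KEY AUTOINCREMENT,"  # serial id, used in URLs
        " uuid TEXT NOT NULL UNIQUE,"              # globally unique id
        " title TEXT NOT NULL)"
    )
    return conn


def create_node(conn, title):
    """Insert a row; the database assigns nid, the application assigns uuid."""
    guid = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO node_example (uuid, title) VALUES (?, ?)", (guid, title)
    )
    return cur.lastrowid, guid


def node_path(nid):
    """Build the short, readable path that relies on the serial id."""
    return "node/%d" % nid
```

The GUID is generated in application code here only because that keeps the sketch independent of any particular database's UUID functions.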

We also can't rely on MySQL 5.1 at this point, as it's not even fully stable, 
to say nothing of widely deployed.

Given that, I don't see much advantage for 99% of sites in using 64-bit unique 
IDs over 32-bit.  It wouldn't break anything I suppose, but how many Drupal 
sites have the billions of nodes required to run out of the 32-bit 
space?  I can't actually think of any.  
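
To put rough numbers on the 32-bit space, here is a back-of-the-envelope sketch in Python. The insert rate is a made-up figure for illustration, not a measurement from any real site, and the function name is hypothetical.

```python
# Largest values a 32-bit integer id column can hold.
INT32_SIGNED_MAX = 2**31 - 1    # 2,147,483,647
INT32_UNSIGNED_MAX = 2**32 - 1  # 4,294,967,295


def years_until_exhausted(max_id, rows_per_day):
    """Years before a serial id reaches max_id at a steady insert rate."""
    return max_id / (rows_per_day * 365.0)


# Assumed rate: 100,000 new rows every day (hypothetical).
signed_years = years_until_exhausted(INT32_SIGNED_MAX, 100000)
unsigned_years = years_until_exhausted(INT32_UNSIGNED_MAX, 100000)
```

At that assumed rate the signed range lasts somewhat under 59 years and the unsigned range about twice that. Serial ids are consumed on every insert, so rows that are later deleted still count against the range.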

Larry Garfield
larry at garfieldtech.com
