[development] Unique/Random IDs and Drupal

Ethan Fremen ethan at acquia.com
Sun Aug 10 21:47:34 UTC 2008


On Aug 10, 2008, at 4:57 PM, Larry Garfield wrote:
> There was discussion of including foreign keys in Schema API in Drupal 6, but
> it was dropped after we determined that we couldn't actually do anything with
> that information at the time.  We have to still support MySQL MyISAM tables,
> which are far and away the most common, and those don't support foreign keys.

I don't actually care whether Schema API supports foreign keys. I just
want a reference to a foreign key in a schema to be declared with a
type that is distinct from 'int'.
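
To illustrate why a distinct type helps, here is a minimal Python sketch.
Schema API itself uses PHP arrays; the 'reference' type name, the
'references' key, and the reference_columns() helper below are all
hypothetical and are not part of Schema API. The point is only that tooling
can find every foreign-key column mechanically once such columns are no
longer declared as plain 'int'.

```python
# Hypothetical table definitions, loosely modeled on Schema API arrays.
# The 'reference' type and the 'references' key are assumptions made
# for illustration only.
schema = {
    "node": {
        "fields": {
            "nid": {"type": "serial"},
            "uid": {"type": "reference", "references": "users"},
            "created": {"type": "int"},
        },
    },
    "users": {
        "fields": {
            "uid": {"type": "serial"},
            "login": {"type": "int"},
        },
    },
}


def reference_columns(schema):
    """Return (table, column, target_table) for every reference column."""
    found = []
    for table, definition in schema.items():
        for column, spec in definition["fields"].items():
            if spec["type"] == "reference":
                found.append((table, column, spec["references"]))
    return found
```

With 'int' used for both node.uid and node.created, nothing in the schema
distinguishes a timestamp from a pointer into another table.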

> Serial fields are actually very useful, and there's nothing wrong with them.

There are several things wrong with them:

a.) They make it impossible to shard the DB, because the shards have to
coordinate what the next sequence ID is, which is a huge problem for scaling.
b.) They make it difficult to merge two (or more) existing DBs, whether
development/production or otherwise.
c.) They mean that every Drupal site has namespace collisions all over
the place with every other Drupal site.
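
The merge problem in (b) and (c) can be shown with a small Python sketch.
The function names are made up for this example, and the random IDs are
simply 64 random bits, which is one possible scheme and not something
Drupal provides.

```python
import secrets


def serial_ids(count):
    """IDs as a serial column would assign them: 1, 2, 3, ..."""
    return list(range(1, count + 1))


def random_ids(count, bits=64):
    """IDs drawn at random from a 64-bit space (an assumed scheme)."""
    return [secrets.randbits(bits) for _ in range(count)]


def colliding_ids(site_a, site_b):
    """IDs that both sites have assigned, and so would clash on merge."""
    return set(site_a) & set(site_b)


# Two independent sites with serial node IDs: every ID on the smaller
# site is also in use on the larger one.
serial_clashes = colliding_ids(serial_ids(1000), serial_ids(500))

# The same two sites with random 64-bit IDs: a clash is possible in
# principle, but extremely unlikely at this size.
random_clashes = colliding_ids(random_ids(1000), random_ids(500))
```

With serial IDs every row of the smaller site has to be renumbered before a
merge, along with every column that refers to it.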

> In fact, they are a requirement if we want even remotely intelligible URLs.

If by "intelligible" you mean "a small number" then you are correct.

> nids, uids, tids, etc. are all used in URLs, and most Drupal nodes do not in
> fact have a URL alias on them AFAIK.

I was under the impression that pathauto was in wide use.

> If we add some form of GUID to the system (which I am not against, and Greg
> Dunlap has made good arguments for) it will have to be in addition to
> existing serial fields.

Which will not, in fact, address any of the issues I've outlined above.

> We also can't rely on MySQL 5.1 at this point, as it's not even fully stable
> to say nothing of widely deployed.

I don't disagree. I was sharing the state of my understanding of DB
support for these sorts of things. I think it is best for Drupal to
generate these IDs itself at the moment.
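
A minimal Python sketch of application-side ID generation follows. It is
not how Drupal does anything: the table layout, new_id(), and insert_node()
are hypothetical, and SQLite stands in for whatever database is in use. The
ID is limited to 63 random bits so that it fits in a signed 64-bit integer
column, and the insert is retried in the unlikely case of a clash.

```python
import secrets
import sqlite3


def new_id():
    """Random non-negative ID that fits a signed 64-bit column."""
    return secrets.randbits(63)


def insert_node(conn, title):
    """Insert a row under a freshly generated ID and return that ID."""
    while True:
        nid = new_id()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO node (nid, title) VALUES (?, ?)",
                    (nid, title),
                )
            return nid
        except sqlite3.IntegrityError:
            # The primary key was already taken: draw another ID.
            continue


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
```

Because the ID is chosen before the insert, no sequence counter has to be
shared between database servers.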

> Given that, I don't see much advantage for 99% of sites to using 64-bit
> unique IDs over 32-bit.

Use cases for the 99% of sites, if they use a 64-bit unique identifier:
a.) Their content is globally unique across the set of Drupal sites.
This makes many syndication and federation tasks easier.
b.) They can merge with another site, even one selected at random, if
they ever wish to.

> It wouldn't break anything I suppose, but how many Drupal sites have the
> multiple millions of nodes required to run out of the 32-bit space?  I can't
> actually think of any.

The set of Drupal sites does. This is like the "we'll never run out of
a 32-bit identifier" idea with IPv4.

Plus, there's exactly 0 performance difference when using a 64-bit
machine.

I hope this helps clarify some of the use cases,

~ethan fremen


