[development] Why do we avoid auto-incrementing db columns?

Earnest Berry III earnest.berry at gmail.com
Wed Apr 25 17:59:45 UTC 2007


Hi Sammy,

There's only on thing I disagree on; the performance hit of the mutex lock.
If you do not have a relatively large number of insertions, than this is not
a large performance hit; however, if you do, this does become a performance
hit. I admit, I'm on the extreme end of the spectrum as I was inserting some
20 million nodes (which, expanded means 20 million nodes, 20 million
node_revisions, and 7-14 million other insertions in diff. tables for the
application). 

After implementing random-id generation with out locking the table ever at
all, the performance increase was very noticeable, on the order to 70-100%
more throughput of node creation.

I would think removing the bottle neck though would benefit sites that
probably  have a lot of comments being added and content, as there is no
contention for the sequences table anymore.



-----Original Message-----
From: development-bounces at drupal.org [mailto:development-bounces at drupal.org]
On Behalf Of Sammy Spets
Sent: Wednesday, April 25, 2007 12:54 PM
To: development at drupal.org
Subject: Re: [development] Why do we avoid auto-incrementing db columns?

Hi everyone,

I'm PostgreSQL maintainer for Drupal and since people are up in the air
about what is going on down in the DB layer re autoincrement and serial
fields I figured i'd shed some light. Many people have already stated
this stuff in earlier mails and hopefully getting the info in one big
pile will make it more concrete.

PostgreSQL does implement atomic sequencing with the serial and
bigserial types. MySQL does the same with the auto_increment field
property. These are used in Drupal to produce a guaranteed unique ID
for tables using them. Implementing this in PHP will certainly generate
non-unique IDs unless you lock tables thus introducing a massive
performance penalty.

We use db_next_id() in Drupal to provide a db-independent method of
getting the next unique ID into the application for use just after the
insert operation on the DB. The performance hit of using a sequence is
next to nothing considering all that's involved is a mutex lock, read,
increment and mutex unlock. Making this more efficient is simply not
possible.

Sure there are two different use cases for insertions: those where you 
need the ID afterwards (e.g node) and those you don't (e.g watchdog). 
If you want to write code without the use of db_next_id() then do an 
INSERT and omit the id column. I don't see anyone wanting to spend 
effort patching Drupal core to gain 0.01% efficiency across the 
application.

/me turns his torch off

-- 
Sammy Spets
Synerger Pty Ltd
http://synerger.com

On 25-Apr-07 10:15, Barry Jaspan wrote:
> 
> Could someone either clue me in or point to a prior discussion about
> why Drupal avoids using auto-incrementing database columns?  If it is
> database compatibility, which versions of which databases that we
> support do not have it?
> 
> Thanks,
> 
> Barry
> 



More information about the development mailing list