[drupal-devel] LOCK vs. SELECT FOR UPDATE
Hi there, until now we've largely ignored people who aren't allowed to lock their tables and thus cannot use Drupal. Somebody suggested (http://drupal.org/node/1190#comment-55984) to use SELECT FOR UPDATE in database.mysql.inc instead. Is there any database expert who can comment on the pros and cons of the approach? Cheers, Gerhard
until now we've largely ignored people who aren't allowed to lock their tables and thus cannot use Drupal. Somebody suggested (http://drupal.org/node/1190#comment-55984) to use SELECT FOR UPDATE in database.mysql.inc instead. Is there any database expert who can comment on the pros and cons of the approach?
I remember we discuss this a few monthes ago see http://drupal.org/node/21439 The problem is that you need MySQL 4 and InnoDb tables to use this.
There are many cases where MySQL supports the syntax but not the functionality. It silently accepts what you enter (whether it is data definition or queries) but it does not do what you think it should do. Most of this has to do with using MyISAM tables but features that only in InnoDB. See here for details http://sql-info.de/mysql/gotchas.html On 9/27/05, Nicolas Tostin <nicolast@logis.com.mx> wrote:
until now we've largely ignored people who aren't allowed to lock their tables and thus cannot use Drupal. Somebody suggested (http://drupal.org/node/1190#comment-55984) to use SELECT FOR UPDATE in database.mysql.inc instead. Is there any database expert who can comment on the pros and cons of the approach?
I remember we discuss this a few monthes ago see http://drupal.org/node/21439 The problem is that you need MySQL 4 and InnoDb tables to use this.
Speaking of which, are there plans to drop MySQL 3.x support and go with 4.x which has been the MySQL standard for quite sometime now (2 years IIRC)? Drupal could certainly utillize this to it's advantage with better optimized queries and less database overhead functions. ted (aka m3avrck)
On 27 Sep 2005, at 18:18, Theodore Serbinski wrote:
Drupal could certainly utillize this to it's advantage with better optimized queries and less database overhead functions.
Mind to be a bit more specific? -- Dries Buytaert :: http://www.buytaert.net/
On 9/28/05, Dries Buytaert <dries.buytaert@gmail.com> wrote:
On 27 Sep 2005, at 18:18, Theodore Serbinski wrote:
Drupal could certainly utillize this to it's advantage with better optimized queries and less database overhead functions.
Mind to be a bit more specific?
Well first off, foreign key support. I know this was introduced in one of the latter 3 versions, 3.23.44, to be exact, but it wasn't until version 4 that most of the foreign key constraints were properly working and all of the bugs squashed. I'd love to see Drupal move to an InnoDB table storage adding foreign key constraints. This would add a tremendous amount of data integrity to Drupal and we could eliminate some overhead. For example, there was just a thread about a user being deleted and then all of the forum posts being deleted as well. If we had things setup with foreign keys, we could something like "ON DELETE SET NULL" in which case all of these posts would be associated with a null user... we could even maybe associate these posts with UID=1 but that is another issue to discuss. As a result of this ON DELETE, ON UPDATE mechanisms, we could delete any other queries in Drupal that clean out other tables as certain rows are delete. Not to mention if a user in phpMyAdmin tries to delete a table, foreign keys can prevent many (but not all) of these tables from being deleted because of foreign key constraints. This can also help future modifications from accidently breaking the database in Drupal as well. More info: http://dev.mysql.com/doc/mysql/en/ansi-diff-foreign-keys.html MyISAM is really best *only* for INSERTs and SELECTs if you read carefully above. Sure there are some tradeoffs but I would like to see this. Additionally, if we move to 4.1 (which is the recommend release now but might be a bit too far ahead now) we could make use of subselects. Once instance came up with the new upload admin consolidation ( http://drupal.org/node/25756 which I wonder why no reviews yet? Patch has been in since the freeze a while back). To perform some of the aggregate queries, subselects would have been better instead of running a single query, that looping through in PHP to get the aggregate results. Now the initial benefits of moving to supporting 4.0 are minimal, however, I think we could massage Drupal a bit into better using some of these advanced features and I'm sure we could cut some bloat and simplify things elsewhere. I haven't looked into this too detailed yet, but I would like to keep this on the horizon. ted (aka m3avrck)
Well first off, foreign key support.
This is going to be hard to do and maintain on an ongoing basis considering our database independence feature. It may not be impossible though. If you or others want to publish an exploratory patch, it will move the conversation along. Until a patch arises, I don't see a point in discussing a drop of mysql 3.x As for Gerhard's original question, it would certainly be nice to get rid of the requirement on LOCK TABLES permission. That has bit me before shared web hosts. Maybe we should use different locking technique depending on the table type or permission level of the DB user.
Moshe Weitzman wrote:
As for Gerhard's original question, it would certainly be nice to get rid of the requirement on LOCK TABLES permission. That has bit me before shared web hosts. Maybe we should use different locking technique depending on the table type or permission level of the DB user.
That bit me on a managed server (strato.de) One would think that if you've got the entire server, managed or not, that you could count on these features being there. I had to move the project to a root server because of LOCK TABLES. -Robert
On Wed, 28 Sep 2005, Robert Douglass wrote:
Moshe Weitzman wrote:
As for Gerhard's original question, it would certainly be nice to get rid of the requirement on LOCK TABLES permission. That has bit me before shared web hosts. Maybe we should use different locking technique depending on the table type or permission level of the DB user.
That bit me on a managed server (strato.de) One would think that if you've got the entire server, managed or not, that you could count on these features being there. I had to move the project to a root server because of LOCK TABLES.
I've searched a bit about when and why we introduced locking in the first place: http://lists.drupal.org/archives/drupal-devel/2002-08/msg00222.html http://lists.drupal.org/archives/drupal-devel/2002-09/msg00087.html http://lists.drupal.org/archives/drupal-devel/2002-08/msg00243.html It still does no explain why we need(ed?) to have the ID before the commit. Does anybody recall this? Cheers, Gerhard
I can't comment on this history, but I think I understand why the LOCK/REPLACE/UNLOCK was done. The reason is to insure that the sequence you got has not been updated by someone else, hence the REPLACE was enclosed in a LOCK/UNLOCK. All this is not needed if we use AUTOINCREMENT fields, which as someone else said, is available in the two supported databases, and even in MS SQL should someone need it in the future. This leaves only two other places where the tables are locked (one is variable_set() and the other is cache_set() which are in bootstrap.inc). On 9/28/05, Gerhard Killesreiter <killesreiter@physik.uni-freiburg.de> wrote:
On Wed, 28 Sep 2005, Robert Douglass wrote:
Moshe Weitzman wrote:
As for Gerhard's original question, it would certainly be nice to get rid of the requirement on LOCK TABLES permission. That has bit me before shared web hosts. Maybe we should use different locking technique depending on the table type or permission level of the DB user.
That bit me on a managed server (strato.de) One would think that if you've got the entire server, managed or not, that you could count on these features being there. I had to move the project to a root server because of LOCK TABLES.
I've searched a bit about when and why we introduced locking in the first place:
http://lists.drupal.org/archives/drupal-devel/2002-08/msg00222.html http://lists.drupal.org/archives/drupal-devel/2002-09/msg00087.html http://lists.drupal.org/archives/drupal-devel/2002-08/msg00243.html
It still does no explain why we need(ed?) to have the ID before the commit. Does anybody recall this?
Cheers, Gerhard
I think we should use autoincrement/serial and have a db_last_id function (mysql_insert_id, select currval, lastInsertId in PDO etc.). This would make Drupal code actually shorter and simpler (a very little bit).
On Thu, 29 Sep 2005, Karoly Negyesi wrote:
I think we should use autoincrement/serial and have a db_last_id function (mysql_insert_id, select currval, lastInsertId in PDO etc.). This would make Drupal code actually shorter and simpler (a very little bit).
I think that I do now recall why we are looking up the next ID first: If we use mysql_insert_id we risk not getting the right ID. See here for some warnings: http://de.php.net/manual/en/function.mysql-insert-id.php Cheers, Gerhard
On Thu, Sep 29, 2005 at 02:12:20AM +0200, Gerhard Killesreiter wrote:
On Thu, 29 Sep 2005, Karoly Negyesi wrote:
I think we should use autoincrement/serial and have a db_last_id function (mysql_insert_id, select currval, lastInsertId in PDO etc.). This would make Drupal code actually shorter and simpler (a very little bit).
I think that I do now recall why we are looking up the next ID first: If we use mysql_insert_id we risk not getting the right ID.
See here for some warnings: http://de.php.net/manual/en/function.mysql-insert-id.php
We do not: http://dev.mysql.com/doc/mysql/en/getting-unique-id.html http://dev.mysql.com/doc/mysql/en/information-functions.html -- Piotrek irc: #debian.pl Mors Drosophilis melanogastribus!
I think that I do now recall why we are looking up the next ID first: If we use mysql_insert_id we risk not getting the right ID.
Per mysql doc: "For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client." I fail to see how could we no get the right ID.
Karoly Negyesi wrote:
I think that I do now recall why we are looking up the next ID first: If we use mysql_insert_id we risk not getting the right ID.
Per mysql doc:
"For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client."
I fail to see how could we no get the right ID.
Do successive requests from the same web server reuse the same connection?
On Thu, 29 Sep 2005 13:39:03 +0200, Moshe Weitzman <weitzman@tejasa.com> wrote:
Karoly Negyesi wrote:
I think that I do now recall why we are looking up the next ID first: If we use mysql_insert_id we risk not getting the right ID. Per mysql doc: "For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client." I fail to see how could we no get the right ID.
Do successive requests from the same web server reuse the same connection?
End of PHP script closes connection. If you have the knowledge to change DB layer to use pconnect then you certainly can (re)introduce LOCKing.
Do successive requests from the same web server reuse the same connection?
End of PHP script closes connection. If you have the knowledge to change DB layer to use pconnect then you certainly can (re) introduce LOCKing.
Someone with more knowledge of threaded MPM's and/or alternate platforms would have to pipe in on this, but with prefork apache, the connection is attached to a httpd child. Without pconnect, the connection closes at the end of the script's execution. With pconnect, the connection stays open for use by a subsequent request using that process. Subsequent != concurrent, so you shouldn't be seeing any race conditions or multiple processes using a connection at the same time. Allie Micka pajunas interactive, inc. http://www.pajunas.com scalable web hosting and open source strategies
Without pconnect, the connection closes at the end of the script's execution. With pconnect, the connection stays open for use by a subsequent request using that process. Subsequent != concurrent, so you shouldn't be seeing any race conditions or multiple processes using a connection at the same time.
OK then. Lets ditch those home grown sequences and make use of auto-increment/serial/identity fields.
If connections are on a per client, then I am with Moshe on this. Let us do it The Right Way (TM). On 9/29/05, Moshe Weitzman <weitzman@tejasa.com> wrote:
Without pconnect, the connection closes at the end of the script's execution. With pconnect, the connection stays open for use by a subsequent request using that process. Subsequent != concurrent, so you shouldn't be seeing any race conditions or multiple processes using a connection at the same time.
OK then. Lets ditch those home grown sequences and make use of auto-increment/serial/identity fields.
Khalid B wrote:
If connections are on a per client, then I am with Moshe on this.
Let us do it The Right Way (TM).
thanks for the kindness. please excuse my brief rant aimed at noone in particular ... there is nothing wrong with our home grown sequences. we currently support RDMS that don't supply sequences of their own, or those that can't reliably return a LAST_INSERT_ID. There is no single "Right Way" to do sequences; these are engineering tradeoffs. I don't think it is helpful to frame engineering decisions this way. I feel similarly when people describe eval() as 'evil' or javascript as 'evil'. broad comments like these do not advance the conversation.
At 5:26 PM +0200 27/9/05, Gerhard Killesreiter wrote:
until now we've largely ignored people who aren't allowed to lock their tables and thus cannot use Drupal.
Once upon a time I wrote some code which simulates table locks using a semaphore system. It's in here if anyone's interested: http://cvs.sourceforge.net/viewcvs.py/phplib/php-lib-stable/php/db_odbc.inc ...R.
participants (11)
-
Allie Micka -
Dries Buytaert -
Gerhard Killesreiter -
Karoly Negyesi -
Khalid B -
Moshe Weitzman -
Nicolas Tostin -
piotrwwwï¼ krukowiecki.net -
Richard Archer -
Robert Douglass -
Theodore Serbinski