Long database updates
Occasionally we need a database update which iterates over each node individually. Since these are at least O(n) operations, we can't trust them to be fast because Drupal runs some big sites. At the same time we can't rely on our ability to change PHP's timeout since safe mode might be on. We don't want to take a long time to do the update without sending feedback to the user since the server churning away looks the same as the server hanging.

We have two long updates. The forthcoming revisions update promises to be a resource hog. Update 124 needs to iterate every comment.

The way to handle these long updates is to use db_query_range() for the initial SELECT for the list of nodes or comments to be processed. This keeps a long update from running too long in one request.

A quick primer on how update.php will work in the near future [1]:

1. Once the version selection form is submitted, a list of every hook_update_N() which needs to be called is compiled and stored as an array in the session.
2. The first update progress page is served. With JavaScript-capable browsers this is an AJAX progress bar. Otherwise, it is a page with a meta-refresh.
3. The update progress pages are now requested automatically from JavaScript or meta-refresh. The array of functions to be called is treated as a queue. Functions are dequeued and run until the time limit of 1 second is reached. The time limit is so low in order to provide plenty of user feedback.
4. Once the queue is empty, the browser is redirected to the finished page.

If a long update does not finish, update.php would store its current $from argument in the session along with the queue. The function would be left at the top of the queue until done.

I see this patch as something to be written separately this weekend once the current update.php patch [1] is applied.

[1] See the patch at http://drupal.org/node/35924.

-- 
Neil Drumm
http://delocalizedham.com/
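The queue-and-resume scheme described above can be modelled in a few lines. This is only a sketch under stated assumptions, not the update.php patch: it uses Python in place of PHP, a plain dict in place of the PHP session, and a list slice in place of db_query_range(). The names BATCH_SIZE, update_comments, update_quick, UPDATES and run_request, and the session keys "queue" and "from", are all hypothetical.

```python
import time

BATCH_SIZE = 2  # hypothetical number of rows handled per call

COMMENTS = list(range(1, 8))  # stand-in for the rows of a comments table


def update_comments(start):
    """Hypothetical long update: process one batch beginning at offset `start`.

    Returns the offset to resume from, or None once every row is done.
    """
    batch = COMMENTS[start:start + BATCH_SIZE]  # models a ranged SELECT
    # ... per-row work on `batch` would go here ...
    next_start = start + len(batch)
    return None if next_start >= len(COMMENTS) else next_start


def update_quick(start):
    """Hypothetical short update that always finishes in a single call."""
    return None


UPDATES = {"update_comments": update_comments, "update_quick": update_quick}


def run_request(session, time_limit=1.0, clock=time.monotonic):
    """Serve one progress-page request; return True once the queue is empty."""
    deadline = clock() + time_limit
    queue = session["queue"]
    while queue and clock() < deadline:
        name = queue[0]                    # peek: an unfinished update stays on top
        next_start = UPDATES[name](session.get("from", 0))
        if next_start is None:
            queue.pop(0)                   # finished: dequeue and reset the offset
            session["from"] = 0
        else:
            session["from"] = next_start   # unfinished: remember where to resume
    return not queue
```

A caller would build something like session = {"queue": ["update_comments", "update_quick"], "from": 0} and call run_request(session) once per progress-page request until it returns True. Peeking at the head of the queue instead of popping it is what keeps an unfinished update at the top until it reports completion.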
On Fri, 2 Dec 2005, Neil Drumm wrote:
Occasionally we need a database update which iterates over each node individually. Since these are at least O(n) operations, we can't trust them to be fast because Drupal runs some big sites. At the same time we can't rely on our ability to change PHP's timeout since safe mode might be on. We don't want to take a long time to do the update without sending feedback to the user since the server churning away looks the same as the server hanging.
In such cases it is worthwhile to investigate if it is possible to offload some of the load to the SQL server as I have done for the first revisions update.
We have two long updates. The forthcoming revisions update promises to be a resource hog.
It is not /that/ bad. For example Drupal.org recently only had about 750 nodes with revisions.
Update 124 needs to iterate every comment.
No idea what it does, but maybe SQL can come to the rescue.

Cheers,
Gerhard
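The suggestion to offload work to the SQL server can be illustrated by comparing a row-by-row loop with a single set-based statement. Neither message says what the revisions update or update 124 actually computes, so the table, its columns, and the trimming operation below are invented purely for illustration, and SQLite stands in for the site's database server.

```python
import sqlite3


def make_db():
    """Build a small in-memory table; schema and data are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (cid INTEGER PRIMARY KEY, subject TEXT)")
    conn.executemany(
        "INSERT INTO comments (cid, subject) VALUES (?, ?)",
        [(1, "  hello"), (2, "world  "), (3, " drupal ")],
    )
    return conn


def update_row_by_row(conn):
    """Application-side loop: one SELECT, then one UPDATE per row."""
    rows = conn.execute("SELECT cid, subject FROM comments").fetchall()
    for cid, subject in rows:
        conn.execute(
            "UPDATE comments SET subject = ? WHERE cid = ?",
            (subject.strip(" "), cid),
        )
    return len(rows)


def update_in_sql(conn):
    """Set-based version: the database does all the work in one statement."""
    cur = conn.execute("UPDATE comments SET subject = TRIM(subject)")
    return cur.rowcount


def subjects(conn):
    """Return all subjects ordered by comment id."""
    return [row[0] for row in conn.execute("SELECT subject FROM comments ORDER BY cid")]
```

Both functions leave the table in the same state. The second needs one statement regardless of row count, which is the kind of saving the reply hints at. Whether a given update can be expressed this way depends on what it has to compute per row.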
Participants (2): Gerhard Killesreiter, Neil Drumm