Database queries on drupal.org
Hi there! Since last night, I let the devel module collect query statistics on drupal.org. We now can see which queries are run very often and which are just slow. Here are the first queries by accumulated time. The first column is accumulated time, the second average time/query, the third the number of times the query was run, the fourth the function which calls the query, the last the query itself: 927.011 0.005 192603 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = S 905.582 0.001 973355 drupal_lookup_path SELECT dst FROM url_alias WHERE src = S 883.059 0.098 8986 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D 388.902 0.014 28330 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = D 186.896 2.967 63 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D 180.948 0.007 26472 cache_set INSERT INTO cache (cid, data, created, expire, headers) VALUES (S) 157.065 0.022 6983 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = S I am not sure why there are two node_load queries which look identical otherwise. Probably these fields need to be merged. Here are the first few queries by average query time, columns as above. 155.447 3.048 51 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN users u ON n.uid = u.uid LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 186.896 2.967 63 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D 94.058 1.844 51 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 109.691 1.155 95 pager_query SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D 1.03 1.03 1 pager_query SELECT a.aid, a.timestamp, a.url, a.uid, u.name FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S ORDER BY a.timestamp DESC LIMIT D 2.536 0.845 3 comment_edit SELECT c.*, u.uid, u.name AS registered_name, u.data FROM comments c INNER JOIN users u ON c.uid = u.uid WHERE c.cid = D 0.614 0.614 1 eval select message, count(message) as count from watchdog where type = S and uid = D 0.434 0.434 1 aggregator_page_sources SELECT f.fid, f.title, f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid, f.title, f.description, f.image ORDER BY last DESC, f.title 0.429 0.429 1 pager_query SELECT q.qid, q.query, q.function, t.*, COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC LIMIT D The queries have been collected on every tenth page view, so you should multiply the count and total time values by ten for real values. Cheers, Gerhard
883.059 0.098 8986 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D
yikes. a big query inside a theme function. FIXME
186.896 2.967 63 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D
the DISTINCT looks unneeded here. also, my node_comment_statistics patch removes a JOIN for this - http://drupal.org/node/75122. not sure who is issueing this query ... we should have some special case for pager_query() so we know who really issued it.
2.536 0.845 3 comment_edit SELECT c.*, u.uid, u.name AS registered_name, u.data FROM comments c INNER JOIN users u ON c.uid = u.uid WHERE c.cid = D
we are missing an index on uid here. i think that would help. node has such an index
0.614 0.614 1 eval select message, count(message) as count from watchdog where type = S and uid = D
whats issueing this query? looks odd. our indexes are not setup for this
0.429 0.429 1 pager_query SELECT q.qid, q.query, q.function, t.*, COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC LIMIT D
an index on total_time might help here if people generally agree with these, i will submit patches.
Moshe Weitzman wrote:
186.896 2.967 63 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D
the DISTINCT looks unneeded here. also, my node_comment_statistics patch removes a JOIN for this - http://drupal.org/node/75122. not sure who is issueing this query ... we should have some special case for pager_query() so we know who really issued it.
The DISTINCT is probably being added by node_db_rewrite_sql() which likes to add DISTINCT to queries even when it's not rewriting them.
On Jul 26, 2006, at 4:45 AM, Gerhard Killesreiter wrote:
Since last night, I let the devel module collect query statistics on drupal.org. We now can see which queries are run very often and which are just slow.
Great! If I can get a copy of the Drupal.org schema then I'll write the EXPLAINs for both these queries and the Slow queries from the Drupal.org databases. We just need a PHP Snippet that puts an EXPLAIN in front of these queries and converts INSERTS, UPDATES, and DELETES to SELECTS to use EXPLAINs. I'll then ask for the snippet to be run on Drupal.org by a maintainer with PHP Input format privileges so I can review the EXPLAINS. It's also worth noting that Peter Zaitsev, the MySQL performance lead, has been explaining how MySQL EXPLAIN can be wrong here: http:// www.mysqlperformanceblog.com/2006/07/24/mysql-explain-limits-and-errors/ Cheers, Kieran
On 26 Jul 2006, at 18:10, Kieran Lal wrote:
Great! If I can get a copy of the Drupal.org schema then I'll write the EXPLAINs for both these queries and the Slow queries from the Drupal.org databases. We just need a PHP Snippet that puts an EXPLAIN in front of these queries and converts INSERTS, UPDATES, and DELETES to SELECTS to use EXPLAINs.
I'll then ask for the snippet to be run on Drupal.org by a maintainer with PHP Input format privileges so I can review the EXPLAINS. It's also worth noting that Peter Zaitsev, the MySQL performance lead, has been explaining how MySQL EXPLAIN can be wrong here: http://www.mysqlperformanceblog.com/2006/07/24/mysql- explain-limits-and-errors/
Isn't our time better spend building that functionality into the devel.module (along with a permission)? It would be convenient if there was an 'explain' link on the query overview page. -- Dries Buytaert :: http://www.buytaert.net/
In order to work with the Drupal API, is there a Drupal API Chart or Map that one can refer to? Regards ----------------------- Fouad -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.394 / Virus Database: 268.10.4/399 - Release Date: 7/25/2006
http://www.google.com/search?q=drupal+cheat+sheet Needs updating though... Gabor On Thu, 27 Jul 2006, Fouad Riaz Bajwa wrote:
In order to work with the Drupal API, is there a Drupal API Chart or Map that one can refer to?
Regards ----------------------- Fouad
-- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.394 / Virus Database: 268.10.4/399 - Release Date: 7/25/2006
On Jul 26, 2006, at 11:03 PM, Dries Buytaert wrote:
On 26 Jul 2006, at 18:10, Kieran Lal wrote:
Great! If I can get a copy of the Drupal.org schema then I'll write the EXPLAINs for both these queries and the Slow queries from the Drupal.org databases. We just need a PHP Snippet that puts an EXPLAIN in front of these queries and converts INSERTS, UPDATES, and DELETES to SELECTS to use EXPLAINs.
I'll then ask for the snippet to be run on Drupal.org by a maintainer with PHP Input format privileges so I can review the EXPLAINS. It's also worth noting that Peter Zaitsev, the MySQL performance lead, has been explaining how MySQL EXPLAIN can be wrong here: http://www.mysqlperformanceblog.com/2006/07/24/mysql- explain-limits-and-errors/
Isn't our time better spend building that functionality into the devel.module (along with a permission)? It would be convenient if there was an 'explain' link on the query overview page.
Unfortunately, no. mysqldump -d drupal_org_db is pretty fast and I can quickly write the explains through trial and error. We submitted a patch a few months ago to the devel module to allow the slow query log from MySQL to be displayed in the devel.module interface. It was decided at the time that configuring slow query logs was too difficult and not worth the effort to expose a log through the devel.module. So for starters we don't have the slowest queries to work on. Next, even if we did have the slow queries in devel.module converting queries into a format that works with Explain is not straight forward. The query dumped by the slow query log needs to be modified to work in explain. Sometimes it's variable replacement, some times it's clearing the syntax for EXPLAIN to work. Next we need to convert UPDATE, INSERT, DELETE to SELECTs. So I agree, building tools is useful for everyone, but quickly showing how we use the existing tools effectively is more useful. I suspect that you'll see the SLOW QUERY log accessible through queries in MySQL rather than a log file in the near future. Cheers, Kieran
-- Dries Buytaert :: http://www.buytaert.net/
Dries Buytaert wrote:
Isn't our time better spend building that functionality into the devel.module (along with a permission)? It would be convenient if there was an 'explain' link on the query overview page.
I actually have that running on my copy of devel module. I was going to AJAX it before committing it. -- Neil Drumm http://delocalizedham.com/
On Jul 27, 2006, at 10:32 AM, Neil Drumm wrote:
Dries Buytaert wrote:
Isn't our time better spend building that functionality into the devel.module (along with a permission)? It would be convenient if there was an 'explain' link on the query overview page.
I actually have that running on my copy of devel module. I was going to AJAX it before committing it.
You are parsing the slow query logs, or explaining every query, or explain queries above a certain threshold? Looking forward to seeing what you have. Kieran
-- Neil Drumm http://delocalizedham.com/
Kieran Lal wrote:
On Jul 27, 2006, at 10:32 AM, Neil Drumm wrote:
Dries Buytaert wrote:
Isn't our time better spend building that functionality into the devel.module (along with a permission)? It would be convenient if there was an 'explain' link on the query overview page.
I actually have that running on my copy of devel module. I was going to AJAX it before committing it.
You are parsing the slow query logs, or explaining every query, or explain queries above a certain threshold?
Looking forward to seeing what you have.
If you have the page query log showing, there is an additional explain column. The idea is that the explain results will be added below the query via AJAX. The issue queue is about 30 reviews behind where I'd like to see it, so I probably won't be getting to this myself soon. -- Neil Drumm http://delocalizedham.com/
On 26 Jul 2006, at 18:10, Kieran Lal wrote:
Great! If I can get a copy of the Drupal.org schema then I'll write the EXPLAINs for both these queries and the Slow queries from the Drupal.org databases. We just need a PHP Snippet that puts an EXPLAIN in front of these queries and converts INSERTS, UPDATES, and DELETES to SELECTS to use EXPLAINs.
I'll then ask for the snippet to be run on Drupal.org by a maintainer with PHP Input format privileges so I can review the EXPLAINS. It's also worth noting that Peter Zaitsev, the MySQL performance lead, has been explaining how MySQL EXPLAIN can be wrong here: http://www.mysqlperformanceblog.com/2006/07/24/mysql- explain-limits-and-errors/
Wouldn't it be possible to build that functionality into the devel.module (along with a permission)? It would be convenient if there was an 'explain' link on the query overview page. -- Dries Buytaert :: http://www.buytaert.net/
Gerhard Killesreiter wrote:
Since last night, I let the devel module collect query statistics on drupal.org. We now can see which queries are run very often and which are just slow.
I have now stopped the data collecting.
Here are the first queries by accumulated time. The first column is accumulated time, the second average time/query, the third the number of times the query was run, the fourth the function which calls the query, the last the query itself:
927.011 0.005 192603 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = S 905.582 0.001 973355 drupal_lookup_path SELECT dst FROM url_alias WHERE src = S 883.059 0.098 8986 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D 388.902 0.014 28330 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = D 186.896 2.967 63 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D 180.948 0.007 26472 cache_set INSERT INTO cache (cid, data, created, expire, headers) VALUES (S) 157.065 0.022 6983 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = S
I am not sure why there are two node_load queries which look identical otherwise. Probably these fields need to be merged.
Here again the first six queries: 12585.885 0.01 1322443 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = S 7524.278 0.001 6814856 drupal_lookup_path SELECT dst FROM url_alias WHERE src = S 6062.889 0.107 56508 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D 4343.435 0.024 183059 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = D 1837.531 0.04 45429 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = S 1538.485 0.011 143889 cache_set INSERT INTO cache (cid, data, created, expire, headers) VALUES (S)
Here are the first few queries by average query time, columns as above.
155.447 3.048 51 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN users u ON n.uid = u.uid LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 186.896 2.967 63 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D 94.058 1.844 51 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 109.691 1.155 95 pager_query SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D 1.03 1.03 1 pager_query SELECT a.aid, a.timestamp, a.url, a.uid, u.name FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S ORDER BY a.timestamp DESC LIMIT D 2.536 0.845 3 comment_edit SELECT c.*, u.uid, u.name AS registered_name, u.data FROM comments c INNER JOIN users u ON c.uid = u.uid WHERE c.cid = D 0.614 0.614 1 eval select message, count(message) as count from watchdog where type = S and uid = D 0.434 0.434 1 aggregator_page_sources SELECT f.fid, f.title, f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid, f.title, f.description, f.image ORDER BY last DESC, f.title 0.429 0.429 1 pager_query SELECT q.qid, q.query, q.function, t.*, COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC LIMIT D
1343.749 3.262 412 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D 1045.001 2.969 352 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN users u ON n.uid = u.uid LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 628.091 1.784 352 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 1091.909 1.327 823 pager_query SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D 19.668 1.093 18 eval select message, count(message) as count from watchdog where type = S and uid = D 147.989 1.08 137 cache_get DELETE FROM cache WHERE expire != D 1.03 1.03 1 pager_query SELECT a.aid, a.timestamp, a.url, a.uid, u.name FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S ORDER BY a.timestamp DESC LIMIT D 10.087 0.56 18 eval select message, count(message) as count from watchdog where type = S and uid != D 136.43 0.493 277 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN project_comments c ON c.nid = p.nid WHERE n.status = D 132.688 0.479 277 pager_query SELECT DISTINCT(n.nid) FROM node n INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN project_comments c ON c.nid = p.nid WHERE n.status = D 1247.877 0.473 2640 do_search CREATE TEMPORARY TABLE temp_search_sids SELECT i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER JOIN node n ON n.nid = i.sid INNER JOIN users u ON n.uid = u.uid WHERE n.status = D 0.933 0.466 2 aggregator_page_sources SELECT f.fid, f.title, f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid, f.title, f.description, f.image ORDER BY last DESC, f.title 0.429 0.429 1 pager_query SELECT q.qid, q.query, q.function, t.*, COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC LIMIT D 9.767 0.425 23 do_search CREATE TEMPORARY TABLE temp_search_sids SELECT i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER JOIN node n ON n.nid = i.sid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN users u ON n.uid = u.uid WHERE n.status = D 6.014 0.401 15 pager_query SELECT uid, access FROM users WHERE uid > D 0.372 0.372 1 pager_query SELECT COUNT(*) FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S 96.123 0.364 264 forum_get_forums SELECT r.tid, COUNT(n.nid) AS topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid WHERE n.status = D 42.274 0.282 150 cache_clear_all DELETE FROM cache WHERE cid = S
The queries have been collected on every tenth page view, so you should multiply the count and total time values by ten for real values.
I will make the collected data available to interested parties. Cheers, Gerhard
On Fri, July 28, 2006 9:39 am, Gerhard Killesreiter said:
I have now stopped the data collecting.
Here are the first queries by accumulated time.
The first column is accumulated time, the second average time/query, the third the number of times the query was run, the fourth the function which calls the query, the last the query itself:
12585.885 0.01 1322443 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = S 7524.278 0.001 6814856 drupal_lookup_path SELECT dst FROM url_alias WHERE src = S 6062.889 0.107 56508 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D 4343.435 0.024 183059 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = D 1837.531 0.04 45429 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = S 1538.485 0.011 143889 cache_set INSERT INTO cache (cid, data, created, expire, headers) VALUES (S)
Here are the first few queries by average query time, columns as above.
1343.749 3.262 412 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D 1045.001 2.969 352 pager_query SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN users u ON n.uid = u.uid LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 628.091 1.784 352 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D 1091.909 1.327 823 pager_query SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D 19.668 1.093 18 eval select message, count(message) as count from watchdog where type = S and uid = D 147.989 1.08 137 cache_get DELETE FROM cache WHERE expire != D 1.03 1.03 1 pager_query SELECT a.aid, a.timestamp, a.url, a.uid, u.name FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S ORDER BY a.timestamp DESC LIMIT D 10.087 0.56 18 eval select message, count(message) as count from watchdog where type = S and uid != D 136.43 0.493 277 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN project_comments c ON c.nid = p.nid WHERE n.status = D 132.688 0.479 277 pager_query SELECT DISTINCT(n.nid) FROM node n INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN project_comments c ON c.nid = p.nid WHERE n.status = D 1247.877 0.473 2640 do_search CREATE TEMPORARY TABLE temp_search_sids SELECT i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER JOIN node n ON n.nid = i.sid INNER JOIN users u ON n.uid = u.uid WHERE n.status = D 0.933 0.466 2 aggregator_page_sources SELECT f.fid, f.title, f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid, f.title, f.description, f.image ORDER BY last DESC, f.title 0.429 0.429 1 pager_query SELECT q.qid, q.query, q.function, t.*, COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC LIMIT D 9.767 0.425 23 do_search CREATE TEMPORARY TABLE temp_search_sids SELECT i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER JOIN node n ON n.nid = i.sid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN users u ON n.uid = u.uid WHERE n.status = D 6.014 0.401 15 pager_query SELECT uid, access FROM users WHERE uid > D 0.372 0.372 1 pager_query SELECT COUNT(*) FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S 96.123 0.364 264 forum_get_forums SELECT r.tid, COUNT(n.nid) AS topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid WHERE n.status = D 42.274 0.282 150 cache_clear_all DELETE FROM cache WHERE cid = S
To my admittedly non-expert eye, this suggests a few (probably obvious, but I'll say them anyway) observations: - The cache system and path alias system are far and away the most database-intensive systems in Drupal, by over an order of magnitude. In both cases, though, it's more due to frequency of call than to complexity. Simply reducing the number of queries should help more than trying to simplify them. - Someone needs to get that query out of the forum module's theme code. :-) - Node loading is also db-intensive (who'd a thunk it, right?) - If node_load()'s main query is called that often, I wonder what else it's doing in the same routine? A given node is loaded by several queries, which would probably make the function more expensive than it seems from these logs. - Pagers are expensive, which I don't think is much of a surprise. - Flushing the cache is non-cheap. I recall a while back someone mentioning MySQL's "deferred" functionality, which could be useful here. (If it takes an extra half-second for a cache-flush to commit, that's probably not a problem for the business logic.) I'm sure some other database experts will have more insightful commentary. :-) --Larry Garfield
On Jul 28, 2006, at 8:49 AM, Larry Garfield wrote:
- Flushing the cache is non-cheap. I recall a while back someone mentioning MySQL's "deferred" functionality, which could be useful here. (If it takes an extra half-second for a cache-flush to commit, that's probably not a problem for the business logic.)
Right. The key here is that we do not use auto-increment in this table. MySQL inserts and deletes assume auto-increments and there are some optimizations that can be done with the LOW-PRIORITY or QUICK for the DELETE command.
147.989 1.08 137 cache_get DELETE FROM cache WHERE expire != D
Could be: delete quick from cache where expire!=D; optimize cache; It's been suggested that Drupal can support conditional non-ANSI SQL statements that at DB specific. Not sure how to check for MySQL version and if that should be done only once so we can run MySQL specific queries. Cheers, Kieran
On Fri, July 28, 2006 11:58 am, Kieran Lal said:
On Jul 28, 2006, at 8:49 AM, Larry Garfield wrote:
- Flushing the cache is non-cheap. I recall a while back someone mentioning MySQL's "deferred" functionality, which could be useful here. (If it takes an extra half-second for a cache-flush to commit, that's probably not a problem for the business logic.)
Right. The key here is that we do not use auto-increment in this table. MySQL inserts and deletes assume auto-increments and there are some optimizations that can be done with the LOW-PRIORITY or QUICK for the DELETE command.
147.989 1.08 137 cache_get DELETE FROM cache WHERE expire != D
Could be: delete quick from cache where expire!=D; optimize cache;
It's been suggested that Drupal can support conditional non-ANSI SQL statements that at DB specific. Not sure how to check for MySQL version and if that should be done only once so we can run MySQL specific queries.
Cheers, Kieran
My thought would be something like db_query_lowpri(). The MySQL version of that would do the necessary modifications before running db_query() (INSERT DELAYED vs DELETE QUICK, or whatever), while the Postgres version would simply pass it straight through (or do Postgres-specific "run this whenever" modifications, about which I know nothing). Dries has already vetoed dedicated insert and delete API functions, which would have been the perfect place for a low-priority flag parameter. :-( --Larry Garfield
On Jul 28, 2006, at 7:39 AM, Gerhard Killesreiter wrote:
1091.909 1.327 823 pager_query SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D
last time these queries came up, i started looking into this one. http://drupal.org/node/74238 i didn't have a chance to finish my investigation, but it's pretty obvious that some of the keys in the cvs_* tables aren't right. if anyone wants to look at this more closely with me at some point, please send me email off-list, or find me in IRC. thanks, -dww
On Jul 28, 2006, at 7:39 AM, Gerhard Killesreiter wrote:
147.989 1.08 137 cache_get DELETE FROM cache WHERE expire != D
I ran tests deleting from a cache of CivicSpaceLabs.org with 10, 000 rows in the cache. delete from cache where expire != D 1.57 second 10K rows delete quick from cache where expire != 1154158000; Query OK, 10170 rows affected (0.96 sec) optimize table cache; 1 row in set (0.21 sec) Drupal's cache has around 80, 000 rows, 3500 pages, so I think it might be worthwhile to use delete. Cheers, Kieran
Gerhard Killesreiter wrote:
Gerhard Killesreiter wrote:
Since last night, I let the devel module collect query statistics on drupal.org. We now can see which queries are run very often and which are just slow.
I have now stopped the data collecting.
I did a second run of collecting data, this time with my patch for better caching applied (http://drupal.org/node/72617). This patch in its current form caches certain elements created by forum module and tries to invalidate them only when really needed.
Here are the first queries by accumulated time. The first column is accumulated time, the second average time/query, the third the number of times the query was run, the fourth the function which calls the query, the last the query itself:
Here again the first six queries:
12585.885 0.01 1322443 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = S 7524.278 0.001 6814856 drupal_lookup_path SELECT dst FROM url_alias WHERE src = S 6062.889 0.107 56508 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D 4343.435 0.024 183059 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = D 1837.531 0.04 45429 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = S 1538.485 0.011 143889 cache_set INSERT INTO cache (cid, data, created, expire, headers) VALUES (S)
Here the same queries for comparison: 5684.353 0.007 765611 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = S 2848.273 0.001 3187992 drupal_lookup_path SELECT dst FROM url_alias WHERE src = S 2100.561 0.207 10170 theme_forum_topic_navigation SELECT n.nid, n.title, n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.nid = r.nid AND r.tid = D 1754.965 0.019 94054 node_load SELECT n.nid, n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE n.nid = D 942.805 0.01 97798 db_lock_table LOCK TABLES cache WRITE 706.889 0.008 86837 cache_set INSERT INTO cache (cid, data, created, expire, headers) VALUES (S) Since I did run this second test for less time than the previous one, the numbers obviously differ. However, there is a way to compare them: We use the drupal_lookup_path as a gauge and assume that per time unit the amount of these queries is a constant. I could also look at the total number of queries, but I'd have to load the other tables again to obtain it. ;) For the first set of queries there have been about 5 times as many lookup path queries than cache_get queries. For the second time this factor is only about 4. This was to be expected: More cached stuff needs more cache_get queries. The ratio of lookup path vs theme_forum_topic_navigation queries changed from about 120 to about 310, so we have much less queries of the forum topic kind and can infer that this kind of caching is rather efficient. There are about 3000 cache entries in the database that were generated by my patch. You will also observe that if you look at the time that is spend doing this query, we don't gain much... While we don't execute as many queries, the individual query is more expensive by almost a factor of 2. I attribute ths to the fact that the query didn't get cached by the MySQL query cache anymore since it is run less frequent. But we didn't want to focus on the query cache anyway. ;) There is also another way to make this particular query more efficient: http://drupal.org/node/76324 Cheers, Gerhard
participants (10)
-
Derek Wright -
Dries Buytaert -
Earl Miles -
Fouad Riaz Bajwa -
Gabor Hojtsy -
Gerhard Killesreiter -
Kieran Lal -
Larry Garfield -
Moshe Weitzman -
Neil Drumm