[development] Database queries on drupal.org

Larry Garfield larry at garfieldtech.com
Fri Jul 28 15:49:07 UTC 2006


On Fri, July 28, 2006 9:39 am, Gerhard Killesreiter said:

> I have now stopped the data collecting.
>
>> Here are the first queries by accumulated time.

> The first column is accumulated time, the second average time/query, the
> third the number of times the query was run, the fourth the function
> which calls the query, the last the query itself:

> 12585.885	0.01	1322443	cache_get	SELECT data, created, headers, expire
> FROM cache WHERE cid = S
> 7524.278	0.001	6814856	drupal_lookup_path	SELECT dst FROM url_alias
> WHERE src = S
> 6062.889	0.107	56508	theme_forum_topic_navigation	SELECT n.nid, n.title,
> n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER
> JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r
> ON n.nid = r.nid AND r.tid = D
> 4343.435	0.024	183059	node_load	SELECT n.nid, n.vid, n.type, n.status,
> n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky,
> r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log,
> r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users
> u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE
> n.nid = D
> 1837.531	0.04	45429	node_load	SELECT n.nid, n.vid, n.type, n.status,
> n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky,
> r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log,
> r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users
> u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE
> n.nid = S
> 1538.485	0.011	143889	cache_set	INSERT INTO cache (cid, data, created,
> expire, headers) VALUES (S)


>> Here are the first few queries by average query time, columns as above.

> 1343.749	3.262	412	pager_query	SELECT DISTINCT(n.nid), n.title, n.type,
> n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post,
> l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER
> JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D
> 1045.001	2.969	352	pager_query	SELECT DISTINCT(n.nid), n.title, n.type,
> n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post,
> l.comment_count FROM node n INNER JOIN node_comment_statistics l ON
> n.nid = l.nid INNER JOIN users u ON n.uid = u.uid LEFT JOIN comments c
> ON n.nid = c.nid AND (c.status = D
> 628.091	1.784	352	pager_query	SELECT COUNT(DISTINCT(n.nid)) FROM node n
> LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D
> 1091.909	1.327	823	pager_query	SELECT DISTINCT m.*, r.*, u.name, u.uid
> FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER
> JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D
> 19.668	1.093	18	eval	select message, count(message) as count from
> watchdog where type = S and uid = D
> 147.989	1.08	137	cache_get	DELETE FROM cache WHERE expire != D
> 1.03	1.03	1	pager_query	SELECT a.aid, a.timestamp, a.url, a.uid, u.name
> FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S
> ORDER BY a.timestamp DESC LIMIT D
> 10.087	0.56	18	eval	select message, count(message) as count from
> watchdog where type = S and uid != D
> 136.43	0.493	277	pager_query	SELECT COUNT(DISTINCT(n.nid)) FROM node n
> INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r
> ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN
> project_comments c ON c.nid = p.nid WHERE n.status = D
> 132.688	0.479	277	pager_query	SELECT DISTINCT(n.nid) FROM node n INNER
> JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON
> r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN
> project_comments c ON c.nid = p.nid WHERE n.status = D
> 1247.877	0.473	2640	do_search	CREATE TEMPORARY TABLE temp_search_sids
> SELECT i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS
> matches FROM search_index i INNER JOIN search_total t ON i.word = t.word
> INNER JOIN node n ON n.nid = i.sid INNER JOIN users u ON n.uid = u.uid
> WHERE n.status = D
> 0.933	0.466	2	aggregator_page_sources	SELECT f.fid, f.title,
> f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f
> LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid, f.title,
> f.description, f.image ORDER BY last DESC, f.title
> 0.429	0.429	1	pager_query	SELECT q.qid, q.query, q.function, t.*,
> COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS
> average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN
> devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC
> LIMIT D
> 9.767	0.425	23	do_search	CREATE TEMPORARY TABLE temp_search_sids SELECT
> i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches
> FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER
> JOIN node n ON n.nid = i.sid INNER JOIN term_node tn ON n.nid = tn.nid
> INNER JOIN users u ON n.uid = u.uid WHERE n.status = D
> 6.014	0.401	15	pager_query	SELECT uid, access FROM users WHERE uid > D
> 0.372	0.372	1	pager_query	SELECT COUNT(*) FROM accesslog a LEFT JOIN
> users u ON a.uid = u.uid WHERE a.path LIKE S
> 96.123	0.364	264	forum_get_forums	SELECT r.tid, COUNT(n.nid) AS
> topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER
> JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r
> ON n.nid = r.nid WHERE n.status = D
> 42.274	0.282	150	cache_clear_all	DELETE FROM cache WHERE cid = S


To my admittedly non-expert eye, this suggests a few (probably obvious,
but I'll say them anyway) observations:

- The cache system and path alias system are far and away the most
database-intensive systems in Drupal, by over an order of magnitude.  In
both cases, though, it's more due to frequency of call than to complexity.
 Simply reducing the number of queries should help more than trying to
simplify them.

- Someone needs to get that query out of the forum module's theme code. :-)

- Node loading is also db-intensive (who'd a thunk it, right?)

- If node_load()'s main query is called that often, I wonder what else
it's doing in the same routine?  A given node is loaded by several
queries, which would probably make the function more expensive than it
seems from these logs.

- Pagers are expensive, which I don't think is much of a surprise.

- Flushing the cache is non-cheap.  I recall a while back someone
mentioning MySQL's "deferred" functionality, which could be useful here. 
(If it takes an extra half-second for a cache-flush to commit, that's
probably not a problem for the business logic.)

I'm sure some other database experts will have more insightful commentary.
:-)

--Larry Garfield



More information about the development mailing list