Larry-<br><br>I'm not claiming to be an expert, but two approaches that we used in 4.6 on SavannahNow.com can really cut db queries.<br><br>I have opened issues for both.<br><br>The first covers path aliasing and allows admins to define the lookup paths that are allowed. In practice, it removes all 'admin' and similar paths from drupal_lookup_path().
<br><br><a href="http://drupal.org/node/75976">http://drupal.org/node/75976</a><br><br>The second is object-caching. The _load functions in Drupal tend to run more often than ideal. Repetitive node_load() calls are lkely due to different modules having to run node_load() to ensure that the node object they need is available,
<br><br>Object caching _might_ help with this.<br><br><a href="http://drupal.org/node/74020">http://drupal.org/node/74020</a><br><br>These are -- I should stress -- both experiments. They need to be tested in large environments (like
<a href="http://drupal.org">drupal.org</a>, or <a href="http://scratch.drupal.org">scratch.drupal.org</a>) to see if they actually buy us any performance. <div><div><br>- Ken Rickard<br>agentrickard<br> </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Message: 5<br>Date: Fri, 28 Jul 2006 10:49:07 -0500 (CDT)<br>From: "Larry Garfield" <<a href="mailto:larry@garfieldtech.com">larry@garfieldtech.com</a>><br>Subject: Re: [development] Database queries on <a href="http://drupal.org">
drupal.org</a><br>To: <a href="mailto:development@drupal.org">development@drupal.org</a><br>Message-ID:<br> <<a href="mailto:1731.70.90.124.113.1154101747.squirrel@webmail.garfieldtech.com">1731.70.90.124.113.1154101747.squirrel@webmail.garfieldtech.com
</a>><br>Content-Type: text/plain;charset=iso-8859-1<br><br>On Fri, July 28, 2006 9:39 am, Gerhard Killesreiter said:<br><br>> I have now stopped the data collecting.<br>><br>>> Here are the first queries by accumulated time.
<br><br>> The first column is accumulated time, the second average time/query, the<br>> third the number of times the query was run, the fourth the function<br>> which calls the query, the last the query itself:<br>
<br>> 12585.885 0.01 1322443 cache_get SELECT data, created, headers, expire<br>> FROM cache WHERE cid = S<br>> 7524.278 0.001 6814856 drupal_lookup_path SELECT dst FROM url_alias<br>> WHERE src = S
<br>> 6062.889 0.107 56508 theme_forum_topic_navigation SELECT n.nid, n.title,<br>> n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER<br>> JOIN node_comment_statistics l ON n.nid =
l.nid INNER JOIN term_node r<br>> ON n.nid = r.nid AND r.tid = D<br>> 4343.435 0.024 183059 node_load SELECT n.nid, n.vid, n.type, n.status,<br>> n.created, n.changed, n.comment, n.promote, n.moderate
, n.sticky,<br>> r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log,<br>> r.format, u.uid, <a href="http://u.name">u.name</a>, u.picture, u.data FROM node n INNER JOIN users<br>> u ON u.uid = n.uid
INNER JOIN node_revisions r ON r.vid = n.vid WHERE<br>> n.nid = D<br>> 1837.531 0.04 45429 node_load SELECT n.nid, n.vid, n.type, n.status,<br>> n.created, n.changed, n.comment, n.promote, n.moderate
, n.sticky,<br>> r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log,<br>> r.format, u.uid, <a href="http://u.name">u.name</a>, u.picture, u.data FROM node n INNER JOIN users<br>> u ON u.uid = n.uid
INNER JOIN node_revisions r ON r.vid = n.vid WHERE<br>> n.nid = S<br>> 1538.485 0.011 143889 cache_set INSERT INTO cache (cid, data, created,<br>> expire, headers) VALUES (S)<br><br><br>>> Here are the first few queries by average query time, columns as above.
<br><br>> 1343.749 3.262 412 pager_query SELECT DISTINCT(n.nid), n.title, n.type,<br>> n.changed, n.uid, <a href="http://u.name">u.name</a>, l.last_comment_timestamp AS last_post,<br>> l.comment_count
FROM node n INNER JOIN users u ON n.uid = u.uid INNER<br>> JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D<br>> 1045.001 2.969 352 pager_query SELECT DISTINCT(n.nid), n.title, n.type
,<br>> n.changed, n.uid, <a href="http://u.name">u.name</a>, l.last_comment_timestamp AS last_post,<br>> l.comment_count FROM node n INNER JOIN node_comment_statistics l ON<br>> n.nid = l.nid INNER JOIN users u ON
n.uid = u.uid LEFT JOIN comments c<br>> ON n.nid = c.nid AND (c.status = D<br>> 628.091 1.784 352 pager_query SELECT COUNT(DISTINCT(n.nid)) FROM node n<br>> LEFT JOIN comments c ON n.nid = c.nid AND (
c.status = D<br>> 1091.909 1.327 823 pager_query SELECT DISTINCT m.*, r.*, <a href="http://u.name">u.name</a>, u.uid<br>> FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER<br>> JOIN users u ON
m.uid = u.uid ORDER BY m.created DESC LIMIT D<br>> 19.668 1.093 18 eval select message, count(message) as count from<br>> watchdog where type = S and uid = D<br>> 147.989 1.08 137 cache_get DELETE FROM cache WHERE expire != D
<br>> 1.03 1.03 1 pager_query SELECT a.aid, a.timestamp, a.url, a.uid, <a href="http://u.name">u.name</a><br>> FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S<br>> ORDER BY a.timestamp
DESC LIMIT D<br>> 10.087 0.56 18 eval select message, count(message) as count from<br>> watchdog where type = S and uid != D<br>> 136.43 0.493 277 pager_query SELECT COUNT(DISTINCT(
n.nid)) FROM node n<br>> INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r<br>> ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN<br>> project_comments c ON c.nid = p.nid WHERE
n.status = D<br>> 132.688 0.479 277 pager_query SELECT DISTINCT(n.nid) FROM node n INNER<br>> JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON<br>> r.vid = n.vid INNER JOIN users u ON
p.assigned = u.uid LEFT JOIN<br>> project_comments c ON c.nid = p.nid WHERE n.status = D<br>> 1247.877 0.473 2640 do_search CREATE TEMPORARY TABLE temp_search_sids<br>> SELECT i.type, i.sid, SUM(i.score
* t.count) AS relevance, COUNT(*) AS<br>> matches FROM search_index i INNER JOIN search_total t ON i.word = t.word<br>> INNER JOIN node n ON n.nid = i.sid INNER JOIN users u ON n.uid = u.uid<br>> WHERE n.status = D
<br>> 0.933 0.466 2 aggregator_page_sources SELECT f.fid, f.title,<br>> f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f<br>> LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid
, f.title,<br>> f.description, f.image ORDER BY last DESC, f.title<br>> 0.429 0.429 1 pager_query SELECT q.qid, q.query, q.function, t.*,<br>> COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time
) AS<br>> average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN<br>> devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC<br>> LIMIT D<br>> 9.767 0.425 23 do_search CREATE TEMPORARY TABLE temp_search_sids SELECT
<br>> i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches<br>> FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER<br>> JOIN node n ON n.nid = i.sid INNER JOIN term_node tn ON
n.nid = tn.nid<br>> INNER JOIN users u ON n.uid = u.uid WHERE n.status = D<br>> 6.014 0.401 15 pager_query SELECT uid, access FROM users WHERE uid > D<br>> 0.372 0.372 1 pager_query SELECT COUNT(*) FROM accesslog a LEFT JOIN
<br>> users u ON a.uid = u.uid WHERE a.path LIKE S<br>> 96.123 0.364 264 forum_get_forums SELECT r.tid, COUNT(n.nid) AS<br>> topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER
<br>> JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r<br>> ON n.nid = r.nid WHERE n.status = D<br>> 42.274 0.282 150 cache_clear_all DELETE FROM cache WHERE cid = S<br><br><br>
To my admittedly non-expert eye, this suggests a few (probably obvious,<br>but I'll say them anyway) observations:<br><br>- The cache system and path alias system are far and away the most<br>database-intensive systems in Drupal, by over an order of magnitude. In
<br>both cases, though, it's more due to frequency of call than to complexity.<br> Simply reducing the number of queries should help more than trying to<br>simplify them.<br><br>- Someone needs to get that query out of the forum module's theme code. :-)
<br><br>- Node loading is also db-intensive (who'd a thunk it, right?)<br><br>- If node_load()'s main query is called that often, I wonder what else<br>it's doing in the same routine? A given node is loaded by several<br>
queries, which would probably make the function more expensive than it<br>seems from these logs.<br><br>- Pagers are expensive, which I don't think is much of a surprise.<br><br>- Flushing the cache is non-cheap. I recall a while back someone
<br>mentioning MySQL's "deferred" functionality, which could be useful here.<br>(If it takes an extra half-second for a cache-flush to commit, that's<br>probably not a problem for the business logic.)<br><br>I'm sure some other database experts will have more insightful commentary.
<br>:-)<br><br>--Larry Garfield<br><br><br><br></blockquote></div><br>