[development] Database queries on drupal.org

Ken Rickard agentrickard at gmail.com
Sat Jul 29 02:42:01 UTC 2006


Larry-

I'm not claiming to be an expert, but two approaches that we used in 4.6 on
SavannahNow.com can really cut db queries.

I have opened issues for both.

The first covers path aliasing and allows admins to define the lookup paths
that are allowed. In practice, it removes all 'admin' and similar paths from
drupal_lookup_path().

http://drupal.org/node/75976

The second is object-caching.  The _load functions in Drupal tend to run
more often than ideal.  Repetitive node_load() calls are lkely due to
different modules having to run node_load() to ensure that the node object
they need is available,

Object caching _might_ help with this.

http://drupal.org/node/74020

These are -- I should stress -- both experiments.  They need to be tested in
large environments (like drupal.org, or scratch.drupal.org) to see if they
actually buy us any performance.
- Ken Rickard
agentrickard


Message: 5
> Date: Fri, 28 Jul 2006 10:49:07 -0500 (CDT)
> From: "Larry Garfield" <larry at garfieldtech.com>
> Subject: Re: [development] Database queries on drupal.org
> To: development at drupal.org
> Message-ID:
>         <1731.70.90.124.113.1154101747.squirrel at webmail.garfieldtech.com>
> Content-Type: text/plain;charset=iso-8859-1
>
> On Fri, July 28, 2006 9:39 am, Gerhard Killesreiter said:
>
> > I have now stopped the data collecting.
> >
> >> Here are the first queries by accumulated time.
>
> > The first column is accumulated time, the second average time/query, the
> > third the number of times the query was run, the fourth the function
> > which calls the query, the last the query itself:
>
> > 12585.885     0.01    1322443 cache_get       SELECT data, created,
> headers, expire
> > FROM cache WHERE cid = S
> > 7524.278      0.001   6814856 drupal_lookup_path      SELECT dst FROM
> url_alias
> > WHERE src = S
> > 6062.889      0.107   56508   theme_forum_topic_navigation    SELECT
> n.nid, n.title,
> > n.sticky, l.comment_count, l.last_comment_timestamp FROM node n INNER
> > JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r
> > ON n.nid = r.nid AND r.tid = D
> > 4343.435      0.024   183059  node_load       SELECT n.nid, n.vid,
> n.type, n.status,
> > n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky,
> > r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log,
> > r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users
> > u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE
> > n.nid = D
> > 1837.531      0.04    45429   node_load       SELECT n.nid, n.vid,
> n.type, n.status,
> > n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky,
> > r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log,
> > r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN users
> > u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE
> > n.nid = S
> > 1538.485      0.011   143889  cache_set       INSERT INTO cache (cid,
> data, created,
> > expire, headers) VALUES (S)
>
>
> >> Here are the first few queries by average query time, columns as above.
>
> > 1343.749      3.262   412     pager_query     SELECT DISTINCT(n.nid),
> n.title, n.type,
> > n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post,
> > l.comment_count FROM node n INNER JOIN users u ON n.uid = u.uid INNER
> > JOIN node_comment_statistics l ON n.nid = l.nid WHERE n.status = D
> > 1045.001      2.969   352     pager_query     SELECT DISTINCT(n.nid),
> n.title, n.type,
> > n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post,
> > l.comment_count FROM node n INNER JOIN node_comment_statistics l ON
> > n.nid = l.nid INNER JOIN users u ON n.uid = u.uid LEFT JOIN comments c
> > ON n.nid = c.nid AND (c.status = D
> > 628.091       1.784   352     pager_query     SELECT COUNT(DISTINCT(
> n.nid)) FROM node n
> > LEFT JOIN comments c ON n.nid = c.nid AND (c.status = D
> > 1091.909      1.327   823     pager_query     SELECT DISTINCT m.*, r.*,
> u.name, u.uid
> > FROM cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER
> > JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT D
> > 19.668        1.093   18      eval    select message, count(message) as
> count from
> > watchdog where type = S and uid = D
> > 147.989       1.08    137     cache_get       DELETE FROM cache WHERE
> expire != D
> > 1.03  1.03    1       pager_query     SELECT a.aid, a.timestamp, a.url,
> a.uid, u.name
> > FROM accesslog a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE S
> > ORDER BY a.timestamp DESC LIMIT D
> > 10.087        0.56    18      eval    select message, count(message) as
> count from
> > watchdog where type = S and uid != D
> > 136.43        0.493   277     pager_query     SELECT COUNT(DISTINCT(
> n.nid)) FROM node n
> > INNER JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r
> > ON r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN
> > project_comments c ON c.nid = p.nid WHERE n.status = D
> > 132.688       0.479   277     pager_query     SELECT DISTINCT(n.nid)
> FROM node n INNER
> > JOIN project_issues p ON p.nid = n.nid INNER JOIN node_revisions r ON
> > r.vid = n.vid INNER JOIN users u ON p.assigned = u.uid LEFT JOIN
> > project_comments c ON c.nid = p.nid WHERE n.status = D
> > 1247.877      0.473   2640    do_search       CREATE TEMPORARY TABLE
> temp_search_sids
> > SELECT i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS
> > matches FROM search_index i INNER JOIN search_total t ON i.word = t.word
> > INNER JOIN node n ON n.nid = i.sid INNER JOIN users u ON n.uid = u.uid
> > WHERE n.status = D
> > 0.933 0.466   2       aggregator_page_sources SELECT f.fid, f.title,
> > f.description, f.image, MAX(i.timestamp) AS last FROM aggregator_feed f
> > LEFT JOIN aggregator_item i ON f.fid = i.fid GROUP BY f.fid, f.title,
> > f.description, f.image ORDER BY last DESC, f.title
> > 0.429 0.429   1       pager_query     SELECT q.qid, q.query, q.function,
> t.*,
> > COUNT(t.qid) AS count, SUM(t.time) AS total_time, AVG(t.time) AS
> > average, STDDEV(t.time) AS stddev FROM devel_queries q INNER JOIN
> > devel_times t ON q.qid = t.qid GROUP BY t.qid ORDER BY total_time DESC
> > LIMIT D
> > 9.767 0.425   23      do_search       CREATE TEMPORARY TABLE
> temp_search_sids SELECT
> > i.type, i.sid, SUM(i.score * t.count) AS relevance, COUNT(*) AS matches
> > FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER
> > JOIN node n ON n.nid = i.sid INNER JOIN term_node tn ON n.nid = tn.nid
> > INNER JOIN users u ON n.uid = u.uid WHERE n.status = D
> > 6.014 0.401   15      pager_query     SELECT uid, access FROM users
> WHERE uid > D
> > 0.372 0.372   1       pager_query     SELECT COUNT(*) FROM accesslog a
> LEFT JOIN
> > users u ON a.uid = u.uid WHERE a.path LIKE S
> > 96.123        0.364   264     forum_get_forums        SELECT r.tid,
> COUNT(n.nid) AS
> > topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER
> > JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r
> > ON n.nid = r.nid WHERE n.status = D
> > 42.274        0.282   150     cache_clear_all DELETE FROM cache WHERE
> cid = S
>
>
> To my admittedly non-expert eye, this suggests a few (probably obvious,
> but I'll say them anyway) observations:
>
> - The cache system and path alias system are far and away the most
> database-intensive systems in Drupal, by over an order of magnitude.  In
> both cases, though, it's more due to frequency of call than to complexity.
> Simply reducing the number of queries should help more than trying to
> simplify them.
>
> - Someone needs to get that query out of the forum module's theme code.
> :-)
>
> - Node loading is also db-intensive (who'd a thunk it, right?)
>
> - If node_load()'s main query is called that often, I wonder what else
> it's doing in the same routine?  A given node is loaded by several
> queries, which would probably make the function more expensive than it
> seems from these logs.
>
> - Pagers are expensive, which I don't think is much of a surprise.
>
> - Flushing the cache is non-cheap.  I recall a while back someone
> mentioning MySQL's "deferred" functionality, which could be useful here.
> (If it takes an extra half-second for a cache-flush to commit, that's
> probably not a problem for the business logic.)
>
> I'm sure some other database experts will have more insightful commentary.
> :-)
>
> --Larry Garfield
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.drupal.org/pipermail/development/attachments/20060729/2d699c53/attachment-0001.htm


More information about the development mailing list