[drupal-devel] [bug] speed up "Top pages in past n days" page
drumm
drupal-devel at drupal.org
Tue Sep 6 19:41:21 UTC 2005
Issue status update for
http://drupal.org/node/28924
Post a follow up:
http://drupal.org/project/comments/add/28924
Project: Drupal
Version: cvs
Component: statistics.module
Category: bug reports
Priority: minor
Assigned to: Jeremy at kerneltrap.org
Reported by: Jeremy at kerneltrap.org
Updated by: drumm
Status: patch (code needs review)
MySQL's insert delayed extension would be perfect for this.
Unfortunately it is not ANSI SQL.
The inserts are done in the exit hook so the extra time is not usually
passed on to the user (when drupal_goto() is used the exit hook
execution happens before the redirect is sent). Although I'm guessing
the total index maintenance time may be larger than the savings on the
statistics pages.
drumm
Previous comments:
------------------------------------------------------------------------
Mon, 15 Aug 2005 17:26:07 +0000 : Jeremy at kerneltrap.org
Attachment: http://drupal.org/files/issues/statistics.module-cvs_0.patch (615 bytes)
Displaying statistics pages can be slow. The attached patch removes one
of the "GROUP BY" columns to increase the speed of generating the "Top
pages in the past n days" page.
(Grouping by 'title' does not work, as there are many different paths
that can have the same title. Instead, 'path' is unique for each page,
so it is a more logical column to group by. I tested this on my active
webpage to verify that the resulting page was what I expected.)
------------------------------------------------------------------------
Tue, 16 Aug 2005 18:08:55 +0000 : drumm
How about a key on the path column?
------------------------------------------------------------------------
Tue, 16 Aug 2005 19:48:46 +0000 : Dries
Committed to HEAD. Marking this active until we clarifie the path index
thing.
------------------------------------------------------------------------
Wed, 17 Aug 2005 00:18:51 +0000 : Jeremy at kerneltrap.org
In which case several of the columns should have keys. I was going to
add it to my earlier patch, but haven't had time.
------------------------------------------------------------------------
Sun, 21 Aug 2005 11:36:43 +0000 : Cvbge
SQL code as in the patch won't work with postgresql:
dt=> SELECT COUNT(path) AS hits, path, title, AVG(timer) AS
average_time, SUM(timer) AS total_time FROM accesslog GROUP BY path;
ERROR: column "accesslog.title" must appear in the GROUP BY clause or
be used in an aggregate function
------------------------------------------------------------------------
Sat, 27 Aug 2005 16:12:59 +0000 : Jeremy at kerneltrap.org
Attachment: http://drupal.org/files/issues/statistics_9.patch (1.16 KB)
The attached adds three keys that I confirmed are used. The keys are on
path, url and uid.
Before adding a key for "path":
EXPLAIN SELECT a.aid, a.timestamp, a.url, a.uid, u.name FROM accesslog
a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE 'node/1%%'
table type possible_keys key key_len ref rows Extra
a ALL 75 Using where
u eq_ref PRIMARY PRIMARY 4 a.uid 1
After:
table type possible_keys key key_len ref rows Extra
a range path path 256 3 Using where
u eq_ref PRIMARY PRIMARY 4 a.uid 1
And another query that gains from the "path" key:
EXPLAIN SELECT COUNT(DISTINCT(path)) FROM accesslog;
table type possible_keys key key_len ref rows Extra
accesslog index path 256 75 Using index
Before adding the key for "url":
EXPLAIN SELECT COUNT(DISTINCT(url)) FROM accesslog WHERE url '' AND
url NOT LIKE '%%node%%';
le type possible_keys key key_len ref rows Extra
accesslog ALL 75 Using where
And after:
table type possible_keys key key_len ref rows Extra
accesslog index url 256 75 Using where; Using index
Before adding the key for "uid":
EXPLAIN SELECT COUNT(DISTINCT(uid)) FROM accesslog;
Result
table type possible_keys key key_len ref rows Extra
accesslog ALL 75
After:
table type possible_keys key key_len ref rows Extra
accesslog index uid 5 75 Using index
------------------------------------------------------------------------
Fri, 02 Sep 2005 15:37:05 +0000 : moshe weitzman
sure, these indices speed up the admin pages. but remember that every
index needsb to be maintained for every insert. since accesslog is
inserted into on every view, this is potentially harmful to a lot more
people than admins. i'm not sure how to measure this tradeoff.
one approach is to copy the access log table to a read only table and
do admin pages off of the copy. that means we have a copy dedicated to
reading a different one dedicated to writing.
More information about the drupal-devel
mailing list