[drupal-devel] [bug] speed up "Top pages in past n days" page

drumm drupal-devel at drupal.org
Tue Sep 6 19:41:21 UTC 2005


Issue status update for 
http://drupal.org/node/28924
Post a follow up: 
http://drupal.org/project/comments/add/28924

 Project:      Drupal
 Version:      cvs
 Component:    statistics.module
 Category:     bug reports
 Priority:     minor
 Assigned to:  Jeremy at kerneltrap.org
 Reported by:  Jeremy at kerneltrap.org
 Updated by:   drumm
 Status:       patch (code needs review)

MySQL's INSERT DELAYED extension would be perfect for this.
Unfortunately it is not ANSI SQL.


The inserts are done in the exit hook, so the extra time is not usually
passed on to the user (when drupal_goto() is used, the exit hook runs
before the redirect is sent). Still, I'm guessing the total index
maintenance time may be larger than the savings on the statistics
pages.




drumm



Previous comments:
------------------------------------------------------------------------

Mon, 15 Aug 2005 17:26:07 +0000 : Jeremy at kerneltrap.org

Attachment: http://drupal.org/files/issues/statistics.module-cvs_0.patch (615 bytes)

Displaying statistics pages can be slow.  The attached patch removes one
of the "GROUP BY" columns to increase the speed of generating the "Top
pages in the past n days" page.


(Grouping by 'title' does not work, as there are many different paths
that can have the same title.  Instead, 'path' is unique for each page,
so it is a more logical column to group by.  I tested this on my active
webpage to verify that the resulting page was what I expected.)
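
A minimal sketch of the grouping difference described above, using
Python's built-in sqlite3 module instead of MySQL. The table is a
cut-down, hypothetical stand-in for accesslog and the rows are made up
for illustration. Two different paths share one title, so grouping by
title merges them while grouping by path keeps them apart.

```python
import sqlite3

# In-memory stand-in for the accesslog table (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accesslog (path TEXT, title TEXT, timer INTEGER)")
conn.executemany(
    "INSERT INTO accesslog (path, title, timer) VALUES (?, ?, ?)",
    [
        ("node/1", "Welcome", 10),
        ("node/1", "Welcome", 30),
        ("node/2", "Welcome", 50),  # a different page with the same title
    ],
)

# Grouping by title collapses both pages into a single row.
by_title = conn.execute(
    "SELECT title, COUNT(*) AS hits FROM accesslog GROUP BY title ORDER BY title"
).fetchall()

# Grouping by path yields one row per page.
by_path = conn.execute(
    "SELECT path, COUNT(*) AS hits FROM accesslog GROUP BY path ORDER BY path"
).fetchall()

print(by_title)
print(by_path)
```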




------------------------------------------------------------------------

Tue, 16 Aug 2005 18:08:55 +0000 : drumm

How about a key on the path column?




------------------------------------------------------------------------

Tue, 16 Aug 2005 19:48:46 +0000 : Dries

Committed to HEAD.  Marking this active until we clarify the path index
thing.




------------------------------------------------------------------------

Wed, 17 Aug 2005 00:18:51 +0000 : Jeremy at kerneltrap.org

In which case several of the columns should have keys.  I was going to
add it to my earlier patch, but haven't had time.




------------------------------------------------------------------------

Sun, 21 Aug 2005 11:36:43 +0000 : Cvbge

SQL code as in the patch won't work with postgresql:



dt=> SELECT COUNT(path) AS hits, path, title, AVG(timer) AS
average_time, SUM(timer) AS total_time FROM accesslog GROUP BY path;
ERROR:  column "accesslog.title" must appear in the GROUP BY clause or
be used in an aggregate function
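
One way to satisfy the rule quoted in that error is to wrap the
non-grouped column in an aggregate. The sketch below is an assumption
about a possible fix, not what the patch does. It runs against SQLite
through Python's sqlite3 module, which is more lenient than PostgreSQL,
so it only shows the shape of the query and its result. The schema and
rows are invented for the example.

```python
import sqlite3

# Simplified, hypothetical accesslog table with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accesslog (path TEXT, title TEXT, timer INTEGER)")
conn.executemany(
    "INSERT INTO accesslog (path, title, timer) VALUES (?, ?, ?)",
    [
        ("node/1", "Welcome", 10),
        ("node/1", "Welcome", 30),
        ("node/2", "About", 50),
    ],
)

# Every selected column is either in GROUP BY (path) or inside an
# aggregate (COUNT, MAX, AVG, SUM), which is what the error asks for.
portable_rows = conn.execute(
    """
    SELECT COUNT(path) AS hits, path, MAX(title) AS title,
           AVG(timer) AS average_time, SUM(timer) AS total_time
    FROM accesslog
    GROUP BY path
    ORDER BY path
    """
).fetchall()

for row in portable_rows:
    print(row)
```

MAX(title) picks one title per path. If a path can carry several titles
over time, which one should be displayed is a separate decision.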



------------------------------------------------------------------------

Sat, 27 Aug 2005 16:12:59 +0000 : Jeremy at kerneltrap.org

Attachment: http://drupal.org/files/issues/statistics_9.patch (1.16 KB)

The attached adds three keys that I confirmed are used.  The keys are on
path, url and uid.


Before adding a key for "path":



EXPLAIN SELECT a.aid, a.timestamp, a.url, a.uid, u.name FROM accesslog
a LEFT JOIN users u ON a.uid = u.uid WHERE a.path LIKE 'node/1%%'

table	type	possible_keys	key	key_len	ref	rows	Extra
a	ALL					75	Using where
u	eq_ref	PRIMARY	PRIMARY	4	a.uid	1	

After:



table	type	possible_keys	key	key_len	ref	rows	Extra
a	range	path	path	256		3	Using where
u	eq_ref	PRIMARY	PRIMARY	4	a.uid	1	

And another query that gains from the "path" key:



EXPLAIN SELECT COUNT(DISTINCT(path)) FROM accesslog;

table   	type	possible_keys	key	key_len	ref	rows	Extra
accesslog	index		path	256		75	Using index

Before adding the key for "url":



EXPLAIN SELECT COUNT(DISTINCT(url)) FROM accesslog WHERE url <> '' AND
url NOT LIKE '%%node%%';

table   	type	possible_keys	key	key_len	ref	rows	Extra
accesslog	ALL					75	Using where

And after:



table   	type	possible_keys	key	key_len	ref	rows	Extra
accesslog	index		url	256		75	Using where; Using index

Before adding the key for "uid":



EXPLAIN SELECT COUNT(DISTINCT(uid)) FROM accesslog;

table   	type	possible_keys	key	key_len	ref	rows	Extra
accesslog	ALL					75	

After:



table   	type	possible_keys	key	key_len	ref	rows	Extra
accesslog	index		uid	5		75	Using index
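
The same before-and-after check can be sketched outside MySQL. The
example below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3
module. The index name accesslog_path, the cut-down schema and the
sample rows are assumptions made for illustration, and SQLite's plan
text differs from the MySQL EXPLAIN output shown above.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the detail strings from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Hypothetical, simplified accesslog table with 75 made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accesslog "
    "(aid INTEGER PRIMARY KEY, path TEXT, url TEXT, uid INTEGER)"
)
conn.executemany(
    "INSERT INTO accesslog (path, url, uid) VALUES (?, ?, ?)",
    [("node/%d" % (i % 25), "", i % 5) for i in range(75)],
)

query = "SELECT COUNT(*) FROM accesslog WHERE path = ?"

# Without a key on path the planner has to scan the whole table.
plan_before = plan(conn, query, ("node/1",))

# Add the key, then ask for the plan again.
conn.execute("CREATE INDEX accesslog_path ON accesslog (path)")
plan_after = plan(conn, query, ("node/1",))

print(plan_before)
print(plan_after)
```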



------------------------------------------------------------------------

Fri, 02 Sep 2005 15:37:05 +0000 : moshe weitzman

sure, these indices speed up the admin pages. but remember that every
index needs to be maintained for every insert. since accesslog is
inserted into on every view, this is potentially harmful to a lot more
people than admins. i'm not sure how to measure this tradeoff.


one approach is to copy the access log table to a read only table and
do admin pages off of the copy. that means we have a copy dedicated to
reading and a different one dedicated to writing.
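
A rough sketch of that split, using Python's sqlite3 module. The table
name accesslog_report, the helper functions and the refresh strategy
(empty the copy, then refill it) are all hypothetical. The write table
carries no secondary keys, and only the copy is indexed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Write side: no secondary keys, so each page view is a cheap insert.
    CREATE TABLE accesslog (path TEXT, title TEXT, timer INTEGER);
    -- Read side (hypothetical): an indexed copy used by the admin pages.
    CREATE TABLE accesslog_report (path TEXT, title TEXT, timer INTEGER);
    CREATE INDEX accesslog_report_path ON accesslog_report (path);
    """
)


def log_hit(conn, path, title, timer):
    """Record one page view in the write-only table."""
    conn.execute(
        "INSERT INTO accesslog (path, title, timer) VALUES (?, ?, ?)",
        (path, title, timer),
    )


def refresh_report(conn):
    """Rebuild the read-only copy, for example from a periodic job."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM accesslog_report")
        conn.execute(
            "INSERT INTO accesslog_report (path, title, timer) "
            "SELECT path, title, timer FROM accesslog"
        )


log_hit(conn, "node/1", "Welcome", 10)
log_hit(conn, "node/1", "Welcome", 30)
log_hit(conn, "node/2", "About", 50)
refresh_report(conn)

# The admin page reads only from the indexed copy.
top_pages = conn.execute(
    "SELECT path, COUNT(*) AS hits FROM accesslog_report "
    "GROUP BY path ORDER BY hits DESC, path"
).fetchall()
print(top_pages)
```

The copy is only as fresh as the last refresh, so the admin pages would
lag behind live traffic by up to one refresh interval.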






