MySql Performance Problem
I suppose this is my fault for packing so much into a single query, but I really do want to use tablesort on it. Here's the query (pager_query with a limit of 20) SELECT n.nid, n.title, n.status, n.created, n.uid, ra.energy, nc.totalcount, nc.daycount, ppc.totalcount AS printcount, pmc.sentcount, r.realname, cs.comment_count, v.value AS votes FROM node n INNER JOIN realname r ON r.uid=n.uid INNER JOIN radioactivity ra ON ra.id=n.nid LEFT JOIN node_counter nc ON nc.nid=n.nid LEFT JOIN node_comment_statistics cs ON cs.nid=n.nid LEFT JOIN votingapi_cache v ON v.content_id=n.nid AND v.function='sum' LEFT JOIN print_page_counter ppc ON ppc.path=CONCAT('node/', n.nid) LEFT JOIN print_mail_page_counter pmc ON pmc.path=CONCAT('node/', n.nid) WHERE n.type = 'blog' AND n.status = 1 ORDER BY ra.energy DESC On my test site, this runs really fast - but there are only 6 rows. On the production site, there are 8,000 nodes and this query takes 35 seconds to run. EXPLAIN tells me: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE ra ALL PRIMARY NULL NULL NULL 6 Using temporary; Using filesort 1 SIMPLE n eq_ref PRIMARY,node_status_type,node_type,uid PRIMARY 4 drupal6b.ra.id 1 Using where 1 SIMPLE nc eq_ref PRIMARY PRIMARY 4 drupal6b.n.nid 1 1 SIMPLE cs eq_ref PRIMARY PRIMARY 4 drupal6b.ra.id 1 1 SIMPLE v ALL NULL NULL NULL NULL 3729 1 SIMPLE ppc ALL PRIMARY NULL NULL NULL 2738 1 SIMPLE pmc ALL PRIMARY NULL NULL NULL 2745 1 SIMPLE r eq_ref PRIMARY PRIMARY 4 drupal6b.n.uid 1 Using where With the performance I see, I am not suprised to see "filesort" in there. Can someone suggest ways to improve the performance? Nancy
On Jul 19, 2010, at 10:18 AM, nan wich wrote:
Can someone suggest ways to improve the performance?
First thing to do: indexes on the radioactivity, print_page_counter, print_mail_page_counter and (possibly) votingapi_cache tables. I say possibly on that one because I can't remember the behavior of that table and if indexing it might decrease performance elsewhere. hth, D
I suggest not to index the votingapi tables, they are usually written very often and there adding an index will create an overhead. The join clauses have a lot of string comparison. If you have a good index size, then you should index the string fields first, else this query is always going to be a bottleneck. 8000 nodes is actually not a lot, try removing the WHERE and ON clauses one by one and you might find the solution. Cheers, Mukesh On Mon, Jul 19, 2010 at 11:25 PM, Domenic Santangelo <domenics@gmail.com>wrote:
On Jul 19, 2010, at 10:18 AM, nan wich wrote:
Can someone suggest ways to improve the performance?
First thing to do: indexes on the radioactivity, print_page_counter, print_mail_page_counter and (possibly) votingapi_cache tables. I say possibly on that one because I can't remember the behavior of that table and if indexing it might decrease performance elsewhere.
hth, D
You may ultimately wish to consider the option of grabbing at least some of the data in straightforward single table lookups into PHP hashes and then operating on those using PHP array operations, liberating yourself from so many MySQL joins and string ops. Victor Kane http://awebfactory.com.ar On Mon, Jul 19, 2010 at 3:05 PM, Mukesh Agarwal <mike4u4ever2001@gmail.com>wrote:
I suggest not to index the votingapi tables, they are usually written very often and there adding an index will create an overhead.
The join clauses have a lot of string comparison. If you have a good index size, then you should index the string fields first, else this query is always going to be a bottleneck.
8000 nodes is actually not a lot, try removing the WHERE and ON clauses one by one and you might find the solution.
Cheers, Mukesh
On Mon, Jul 19, 2010 at 11:25 PM, Domenic Santangelo <domenics@gmail.com>wrote:
On Jul 19, 2010, at 10:18 AM, nan wich wrote:
Can someone suggest ways to improve the performance?
First thing to do: indexes on the radioactivity, print_page_counter, print_mail_page_counter and (possibly) votingapi_cache tables. I say possibly on that one because I can't remember the behavior of that table and if indexing it might decrease performance elsewhere.
hth, D
Yes, I had considered that, but then I cannot use tablesort, which would be really nice to have. Nancy ________________________________ From: Victor Kane <victorkane@gmail.com> To: development@drupal.org Sent: Mon, July 19, 2010 2:10:55 PM Subject: Re: [development] MySql Performance Problem You may ultimately wish to consider the option of grabbing at least some of the data in straightforward single table lookups into PHP hashes and then operating on those using PHP array operations, liberating yourself from so many MySQL joins and string ops. Victor Kane http://awebfactory.com.ar On Mon, Jul 19, 2010 at 3:05 PM, Mukesh Agarwal <mike4u4ever2001@gmail.com> wrote: I suggest not to index the votingapi tables, they are usually written very often and there adding an index will create an overhead.
The join clauses have a lot of string comparison. If you have a good index size, then you should index the string fields first, else this query is always going to be a bottleneck.
8000 nodes is actually not a lot, try removing the WHERE and ON clauses one by one and you might find the solution.
Cheers, Mukesh
On Mon, Jul 19, 2010 at 11:25 PM, Domenic Santangelo <domenics@gmail.com>
wrote:
On Jul 19, 2010, at 10:18 AM, nan wich wrote:
Can someone suggest ways to improve the performance?
First thing to do: indexes on the radioactivity, print_page_counter, print_mail_page_counter and (possibly) votingapi_cache tables. I say possibly on that one because I can't remember the behavior of that table and if indexing it might decrease performance elsewhere.
hth, D
Print_page_counter and print_mail_page_counter both have indexes on "path" which is how I am accessing them. I added "AND v.content_type='node'" to the votingapi_cache because it has an index on content_type and content_id. That cut the time in half. Radioactivity has an index on id, class, and decay_profile (in that order). My query has id and class; but does not have decay_profile. Thank you, it is faster now, but I'd still like it even faster. So I now have SELECT n.nid, n.title, n.status, n.created, n.uid, ra.energy, nc.totalcount, nc.daycount, ppc.totalcount AS printcount, pmc.sentcount, r.realname, cs.comment_count, v.value AS votes FROM node n INNER JOIN radioactivity ra ON ra.id=n.nid AND ra.class='node' INNER JOIN realname r ON r.uid=n.uid LEFT JOIN node_counter nc ON nc.nid=n.nid LEFT JOIN node_comment_statistics cs ON cs.nid=n.nid LEFT JOIN votingapi_cache v ON v.content_id=n.nid AND v.function='sum' AND v.content_type='node' LEFT JOIN print_page_counter ppc ON ppc.path=CONCAT('node/', n.nid) LEFT JOIN print_mail_page_counter pmc ON pmc.path=CONCAT('node/', n.nid) WHERE n.type = 'blog' AND n.status = 1 ORDER BY ra.energy DESC LIMIT 10 Nancy ________________________________ From: Domenic Santangelo <domenics@gmail.com> To: development@drupal.org Sent: Mon, July 19, 2010 1:55:48 PM Subject: Re: [development] MySql Performance Problem On Jul 19, 2010, at 10:18 AM, nan wich wrote: Can someone suggest ways to improve the performance? First thing to do: indexes on the radioactivity, print_page_counter, print_mail_page_counter and (possibly) votingapi_cache tables. I say possibly on that one because I can't remember the behavior of that table and if indexing it might decrease performance elsewhere. hth, D
The query is doing a WHERE on the node table and an ORDER BY on the radioactivity table. That will force a temp table no matter what you do. If the temp table is larger than some size (no idea how MySQL determines that size) it will dump it to disk, hence the filesort. Your best bet is to try and reduce the size of the result set as much as possible to keep the temp table small. That is, if you can throw anything else into the WHERE clause on the node table (eg, node type, created recently, etc.) that will probably help. Indexing the votingapi fields you're joining on may help. Don't worry about the write cost. Updating an index on an integer field is pretty fast, and if you're then joining on that field it's going to be a net win in most cases. Try it and see how much difference it makes. --Larry Garfield On 7/19/10 1:11 PM, nan wich wrote:
Print_page_counter and print_mail_page_counter both have indexes on "path" which is how I am accessing them. I added "AND v.content_type='node'" to the votingapi_cache because it has an index on content_type and content_id. That cut the time in half. Radioactivity has an index on id, class, and decay_profile (in that order). My query has id and class; but does not have decay_profile. Thank you, it is faster now, but I'd still like it even faster. So I now have SELECT n.nid, n.title, n.status, n.created, n.uid, ra.energy, nc.totalcount, nc.daycount, ppc.totalcount AS printcount, pmc.sentcount, r.realname, cs.comment_count, v.value AS votes FROM node n INNER JOIN radioactivity ra ON ra.id=n.nid AND ra.class='node' INNER JOIN realname r ON r.uid=n.uid LEFT JOIN node_counter nc ON nc.nid=n.nid LEFT JOIN node_comment_statistics cs ON cs.nid=n.nid LEFT JOIN votingapi_cache v ON v.content_id=n.nid AND v.function='sum' AND v.content_type='node' LEFT JOIN print_page_counter ppc ON ppc.path=CONCAT('node/', n.nid) LEFT JOIN print_mail_page_counter pmc ON pmc.path=CONCAT('node/', n.nid) WHERE n.type = 'blog' AND n.status = 1 ORDER BY ra.energy DESC LIMIT 10
/*Nancy*/
------------------------------------------------------------------------ *From:* Domenic Santangelo <domenics@gmail.com> *To:* development@drupal.org *Sent:* Mon, July 19, 2010 1:55:48 PM *Subject:* Re: [development] MySql Performance Problem
On Jul 19, 2010, at 10:18 AM, nan wich wrote:
Can someone suggest ways to improve the performance?
First thing to do: indexes on the radioactivity, print_page_counter, print_mail_page_counter and (possibly) votingapi_cache tables. I say possibly on that one because I can't remember the behavior of that table and if indexing it might decrease performance elsewhere.
hth, D
What about substring() instead of concat() or adding a column for the nid in ppc and pmc? Cheers, tim
Honestly I can see no good reason why the Print module uses a path rather than a "class" and "id" as several other modules do, but it's what I have to live with. Using substring might work; I will see if that helps. Nancy ________________________________ From: T L <tloud365@gmail.com> To: development@drupal.org Sent: Mon, July 19, 2010 10:20:26 PM Subject: Re: [development] MySql Performance Problem What about substring() instead of concat() or adding a column for the nid in ppc and pmc? Cheers, tim
Just to let you know, the substring() was slightly slower. Nancy ________________________________ From: T L <tloud365@gmail.com> To: development@drupal.org Sent: Mon, July 19, 2010 10:20:26 PM Subject: Re: [development] MySql Performance Problem What about substring() instead of concat() or adding a column for the nid in ppc and pmc? Cheers, tim
participants (6)
-
Domenic Santangelo -
larry@garfieldtech.com -
Mukesh Agarwal -
nan wich -
T L -
Victor Kane