MySQL queries and Modules
Hi @all,

is there a way to improve the MySQL queries issued by several modules? I've loaded approx. 40 modules and it slows the site terribly...

There is also the issue that when an anonymous user loads a page, Drupal also runs the MySQL queries meant for logged-in users.

Does anybody else have such problems? If so, maybe we can exchange experiences.

Thanks for your help,
regards
Andreas

--
Andreas Laesser
Signal Proc. & Speech Communication Lab.
Graz University of Technology
Inffeldgasse 12 | A-8010 Graz | Austria
http://www.spsc.tugraz.at
Tel: +43 (0)316 873-4443
Fax: DW 104439
On Friday, 25 February 2011 at 10:46 +0100, Andreas Laesser wrote:
Hi @all,
is there a way to improve the MySQL queries issued by several modules? I've loaded approx. 40 modules and it slows the site terribly...
There is also the issue that when an anonymous user loads a page, Drupal also runs the MySQL queries meant for logged-in users.
Does anybody else have such problems? If so, maybe we can exchange experiences.
Thanks for your help, regards Andreas
The "dynamic"-oriented approach leaves most developers unable to tell when they should or should not make queries. The cache-oriented API (i.e. using cache_get() and cache_set() everywhere) is often badly designed. cache_get() and cache_set() called many times on small caches generate a lot of queries, whereas the real caching points should be something like only the entry and output points of APIs. Sometimes recalculating a small amount of data is faster than fetching the result from a cache (it takes profiling to know exactly when and where).

To reduce queries, use modules that rely on fields as often as you can, because CCK caches full node objects with their field content and therefore greatly reduces the number of queries.

Do profiling, and remove the modules that run those queries. If you use Views a lot, it is worth a shot to switch to node display mode using build modes: it makes CCK cache everything (field content etc.) instead of using a custom view-based display mode that could (in certain use cases only) generate a lot more queries.

Avoid node-altering modules that make heavy use of hook_nodeapi(), because those calls are never cached anywhere, since CCK only caches field content and such.

Once you are using more than ~30 modules, you may have to refactor your site and write some bits of custom glue instead of using overly generic modules that attempt to do all the consistency checks for you; you will gain a lot in terms of performance.

Logged-in users are always a problem and there is no magical solution.

Avoid modules that use hook_init(): they slow down the bootstrap process and therefore create a constant bottleneck in the bootstrap itself, even for non-logged-in users and 404/403 hits.

Pierre.
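Pierre's point about caching at API entry points rather than around every small value can be illustrated with a minimal sketch. It is written in Python rather than Drupal's PHP so that it runs on its own, and everything in it (CountingCache, build_item, the two listing functions) is a hypothetical stand-in, not Drupal's cache_get()/cache_set() API. The only assumption is that each cache lookup or write costs one database query.

```python
class CountingCache:
    """Stand-in for a database-backed cache; each get() or set() counts as one query."""

    def __init__(self):
        self.store = {}
        self.queries = 0

    def get(self, key):
        self.queries += 1
        return self.store.get(key)

    def set(self, key, value):
        self.queries += 1
        self.store[key] = value


def build_item(n):
    """Hypothetical cheap computation that is not worth caching on its own."""
    return n * n


def listing_fine_grained(cache, ids):
    """Caches every small item separately: one cache query per item."""
    result = []
    for n in ids:
        value = cache.get(f"item:{n}")
        if value is None:
            value = build_item(n)
            cache.set(f"item:{n}", value)
        result.append(value)
    return result


def listing_entry_point(cache, ids):
    """Caches only the finished result at the API entry point: one cache query."""
    key = "listing:" + ",".join(map(str, ids))
    result = cache.get(key)
    if result is None:
        result = [build_item(n) for n in ids]
        cache.set(key, result)
    return result


def warm_queries(listing, ids):
    """Returns the number of cache queries for one call against a warm cache."""
    cache = CountingCache()
    listing(cache, ids)   # first call fills the cache
    cache.queries = 0
    listing(cache, ids)   # second call is served from the cache
    return cache.queries
```

With 20 ids and a warm cache, warm_queries() reports 20 cache queries for the fine-grained version and 1 for the entry-point version. Whether either beats simply recomputing the data is exactly what the profiling mentioned above has to answer.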
When I develop modules, I have Devel turned on. One main thing I look for is repeated queries. Unfortunately, I find a lot of them, especially in other people's modules. I'm glad to see global caching available in D7 so that sub-functions don't have to repeat information gathering that another function has already done.

As a teacher once explained to me, "All generalities are false, including this one." Some of your statements are decent guidelines, but hardly universally true. For example, "avoid ... hook_init" -- sometimes it is necessary, so one must be discerning in applying this rule.

Indeed, using cache is a balancing act. When I added caching to one of my modules, I checked its performance impact. Using the main cache table made things worse because cache_get queries were taking longer than no caching at all. So I created my own cache table and it solved the performance issue that I was attempting to fix. I still don't think it was a good substitute for a properly tuned database server, but few people have those.

Nancy

Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
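Looking for repeated queries, as Nancy describes doing with Devel, boils down to counting identical statements in a request's query log. The following is only a rough Python sketch of that idea, not how Devel itself works; it assumes the log is available as a plain list of SQL strings, and the function name repeated_queries is made up for the example.

```python
from collections import Counter


def repeated_queries(query_log, min_count=2):
    """Return (query, count) pairs for statements issued at least min_count times.

    Whitespace is collapsed so that queries differing only in formatting
    are counted as the same statement.
    """
    counts = Counter(" ".join(query.split()) for query in query_log)
    return [(query, n) for query, n in counts.most_common() if n >= min_count]


# Hypothetical log of one page request.
log = [
    "SELECT * FROM node WHERE nid = 1",
    "SELECT  *  FROM node WHERE nid = 1",
    "SELECT * FROM users WHERE uid = 0",
    "SELECT * FROM node WHERE nid = 1",
]
duplicates = repeated_queries(log)
```

Here duplicates contains a single entry: the node query, counted three times. Any statement that shows up in such a list is a candidate for being fetched once and reused within the request.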
On Sat, 2011-02-26 at 09:22 -0800, nan wich wrote:
When I develop modules, I have Devel turned on. One main thing I look for is repeated queries. Unfortunately, I find a lot of them, especially in other people's modules. I'm glad to see global caching available in D7 so that sub-functions don't have to repeat information gathering that another function has already done.
If you mean drupal_static(), then you should say "registry pattern" rather than "global caching": naming the pattern that is actually implemented avoids confusion and opens the way to a better understanding and probably better usage.
As a teacher once explained to me, "All generalities are false, including this one." Some of your statements are decent guidelines, but hardly universally true. For example, "avoid ... hook_init" -- sometimes it is necessary, so one must be discerning in applying this rule.
Sometimes, but when you are writing a hook_init(), in 90% of the cases you are wrong. This hook runs even when generating an image (cache|style) derivative, which means that, just as on a 404 error, you run a whole bunch of totally unnecessary and/or really costly operations.
Indeed, using cache is a balancing act. When I added caching to one of my modules, I checked its performance impact. Using the main cache table made things worse because cache_get queries were taking longer than no caching at all. So I created my own cache table and it solved the performance issue that I was attempting to fix. I still don't think it was a good substitute for a properly tuned database server, but few people have those.
Using a different cache because it improves performance is not always the right way to go. If fetching an entry from the global cache table is slower than rebuilding your data, then you should probably *always* rebuild it. Being slower in the global cache table than in your own table doesn't mean you solved the problem by creating the custom table; it only means that you delayed its negative effects, because as soon as the site scales in terms of data volume, it will happen again. By relying on this kind of judgement instead of separating caches logically (critical, always-used caches such as bootstrap and menu, versus heavy and occasional caches such as an aggressive page cache), you break the sysadmin's work, which is to distribute those bins over different cache backends (memcache, APC, XCache, database, ...) depending on the data volume and the measured performance impact in the physical environment.
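The division of labour described here (modules only name a logical bin, the sysadmin decides which backend serves it) can be sketched as follows. This is a Python illustration under stated assumptions, not Drupal's cache API or its configuration format: DictBackend and CacheRouter are hypothetical classes, the "memcache" and "database" backends are in-memory stand-ins, and the bin names are used only as examples.

```python
class DictBackend:
    """In-memory stand-in for a real backend (database, memcache, APC, ...)."""

    def __init__(self, label):
        self.label = label
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class CacheRouter:
    """Routes each logical bin to a backend chosen in configuration, not in module code."""

    def __init__(self, default_backend, bin_backends=None):
        self.default_backend = default_backend
        self.bin_backends = dict(bin_backends or {})

    def backend_for(self, bin_name):
        # Bins without an explicit mapping fall back to the default backend.
        return self.bin_backends.get(bin_name, self.default_backend)

    def get(self, bin_name, key):
        return self.backend_for(bin_name).get(key)

    def set(self, bin_name, key, value):
        self.backend_for(bin_name).set(key, value)


database = DictBackend("database")
memcache = DictBackend("memcache")

# Sysadmin's decision: small, always-used bins go to the fast backend.
router = CacheRouter(
    default_backend=database,
    bin_backends={"cache_bootstrap": memcache, "cache_menu": memcache},
)

# Module code only names the bin it belongs to.
router.set("cache_bootstrap", "variables", {"site_name": "example"})
router.set("cache_page", "/node/1", "<html>...</html>")
```

Moving a bin to another backend is then a one-line change in the mapping. A module that hard-codes its own table because it happened to be faster on one site takes that choice away from whoever tunes the deployment.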
Nancy
Pierre.
Yes, I meant drupal_static(); I just couldn't think of the function when I wrote that (I haven't done that much D7 coding yet).

Ah, see, now you say that 10% of hook_init()s are appropriate, so telling someone to always avoid it is bad advice. In all the coding that I have done, I have used hook_init() maybe three times (twice I know for sure). And one was largely a toss-up between hook_init() and hook_boot() and, I can guarantee, was needed. And you would probably add hook_exit() to the list of questionable usage.

In the case where I created my own cache file, it was more an attempt to "help" the database keep the data available because rebuilding the data (taxonomy-based) could get quite heavy in certain circumstances. When I convert the module to D7, I may not need any of that because field data may already be cached anyway. As a matter of fact, 90% of the module may no longer be needed.

Nancy
On Sun, 2011-02-27 at 11:31 -0800, nan wich wrote:
Ah, see, now you say that 10% of hook_init()s are appropriate, so telling someone to always avoid it is bad advice. In all the coding that I have done, I have used hook_init() maybe three times (twice I know for sure). And one was largely a toss-up between hook_init() and hook_boot() and, I can guarantee, was needed. And you would probably add hook_exit() to the list of questionable usage.
Don't make me say something I didn't mean. What I really meant is that hook_init() should never be used; some modules legitimately use it as a way to set up context-related information because no other helper exists in core (that is actually part of the Butler project on g.d.o). But aside from that, all hook_init() implementations should be banned. In fact, even existing ones (I'm thinking of the D6 OG module) can turn out to be incredibly slow under certain circumstances.
In the case where I created my own cache file, it was more an attempt to "help" the database keep the data available because rebuilding the data (taxonomy-based) could get quite heavy in certain circumstances. When I convert the module to D7, I may not need any of that because field data may already be cached anyway. As a matter of fact, 90% of the module may no longer be needed.
If you choose to separate your own cache for logic and business reasons, this is totally valid. As for the current D7 core implementation, I'm not sure fields are cached; that's something that the Entity API attempts to resolve. They didn't figure out a nice way to do it in core before the D7 release.

Pierre.
I'm going to agree with Nan Wich and say that your explicit "never use" statement goes overboard. Used properly, hook_init() is a must. However, I do understand that the hook gets used improperly, and grabbing a bunch of data from the DB should never be one of its uses.

--
Earnie
-- http://progw.com
-- http://www.for-my-kids.com
participants (4)
- Andreas Laesser
- Earnie Boyd
- nan wich
- Pierre Rineau