On Thu, 26 May 2005, Dries Buytaert wrote:
In November last year, I profiled the behavior of drupal.org's page cache. Yesterday, Moshe asked me to profile it again so we could evaluate the usefulness of Jeremy's "loose caching" mechanism.
Over the past 20 hours, I logged 93,000 unique page requests using the patch at http://buytaert.net/temporary/cache-statistics.patch. Loose caching was enabled.
Results:
1. Last year we found that authenticated users were responsible for 15.8% of all page views. A year later, we see that authenticated users are responsible for 14.9% of all page views.
2. Last year we found that only 27.9% of the page requests actually benefited from the cache system. That is, for more than two thirds of the page requests, we had to generate a page dynamically. A year later, using "loose caching" rather than "strict caching", we see that 30.7% of the page requests benefit from the cache system. Read: we still have a lot of page cache misses! :(
3. Last year, we found that the cache got flushed once every 207 page requests. A year later, we observe that the cache got flushed once every 190 page requests.
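To put the numbers above side by side, here is a small back-of-the-envelope calculation. It uses only the figures quoted in the results, and it assumes the flush interval was measured over the same 20-hour request log; the variable names are made up for illustration.

```python
# Figures quoted in the results above; nothing here is newly measured.
requests = 93_000          # page requests logged over roughly 20 hours
hours = 20
hit_rate_last_year = 0.279 # strict caching
hit_rate_this_year = 0.307 # loose caching
requests_per_flush = 190   # this year's flush interval

# Share of requests that had to be generated dynamically.
miss_rate_last_year = 1 - hit_rate_last_year
miss_rate_this_year = 1 - hit_rate_this_year

# Improvement in hit rate, in percentage points.
gain_points = (hit_rate_this_year - hit_rate_last_year) * 100

# Absolute load implied by the miss rate (assumes the rate applies to the whole log).
dynamic_pages = round(requests * miss_rate_this_year)

# How often the whole cache was thrown away (assumes the same 20-hour window).
flushes = requests / requests_per_flush
minutes_between_flushes = hours * 60 / flushes

print(f"miss rate: {miss_rate_last_year:.1%} -> {miss_rate_this_year:.1%}")
print(f"hit rate gain: {gain_points:.1f} percentage points")
print(f"pages generated dynamically: {dynamic_pages}")
print(f"cache flushes: about {flushes:.0f}, one every {minutes_between_flushes:.1f} minutes")
```

Under those assumptions, roughly 64,000 of the 93,000 requests were still generated dynamically, and the cache was flushed about every two and a half minutes.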
We conclude that:
1. Loose caching does not significantly -- or not necessarily -- improve the behavior of drupal.org's page cache (though I'd like to believe that it does when there are sudden traffic spikes/bursts).
2. When writing code, we can NOT assume that a page will benefit from being cached.
My conclusion is that our cache needs to be more finely grained (right, that's not new).

For example: comment.module calls cache_clear_all() after a comment has been added. Granted, thanks to our block system, some comment (or comment count, or "last updated" ...) can be displayed on just about any page. But I'd prefer that modules providing such blocks build their own block cache and invalidate cached pages according to the blocks' path settings. That should not be too difficult. The same should be done for blocks that display content based on newly created nodes. Those modules should take the cache setting (none, loose, strict) into account.

For comment.module that would mean that it only sets variables (comment_last_timestamp, comment_last_uid, comment_last_nid, comment_last_cid) to new values instead of invalidating the cache. forum.module would then check the variable against its own variable (forum_last_time_page_rebuilt) and invalidate pages as appropriate. The same goes for tracker and other modules that deal with comments in one way or another.

More generally, we should investigate whether a page cache makes sense for a community site at all. For drupal.org it might make more sense to build up a page from pre-cached pieces of content (nodes, blocks, ...) than to just deliver a complete page from the cache. There is just too much new content added for a global cache to be useful.

Cheers,
Gerhard
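A rough sketch of the variable-comparison idea described above, written in Python rather than PHP to keep it short. It is not Drupal code: the `variables` dict stands in for Drupal's persistent variables, `comment_save()` and `forum_page()` are hypothetical names, and the logical clock replaces real timestamps so the example is deterministic. The cache setting (none, loose, strict) is left out for brevity.

```python
import itertools

# Hypothetical stand-in for Drupal's persistent variables.
variables = {"comment_last_timestamp": 0, "forum_last_time_page_rebuilt": 0}

# Logical clock instead of wall-clock time, so runs are reproducible.
_clock = itertools.count(1)

def now():
    return next(_clock)

forum_page_cache = {}  # path -> rendered page, owned by the forum module only
rebuild_count = 0      # how many times a forum page was actually rendered

def comment_save(nid, cid, uid):
    """Comment side: record what changed instead of flushing the global cache."""
    variables["comment_last_timestamp"] = now()
    variables["comment_last_nid"] = nid
    variables["comment_last_cid"] = cid
    variables["comment_last_uid"] = uid

def forum_page(path):
    """Forum side: drop its own cached pages only if a comment arrived since the last rebuild."""
    global rebuild_count
    stale = (variables["comment_last_timestamp"]
             > variables["forum_last_time_page_rebuilt"])
    if stale:
        forum_page_cache.clear()
        variables["forum_last_time_page_rebuilt"] = now()
    if path not in forum_page_cache:
        rebuild_count += 1
        forum_page_cache[path] = f"rendered {path} (build {rebuild_count})"
    return forum_page_cache[path]

first = forum_page("forum")
again = forum_page("forum")          # served from the forum's own cache
comment_save(nid=12, cid=34, uid=5)  # no global flush happens here
fresh = forum_page("forum")          # rebuilt because a comment arrived
```

The point of the design is that a new comment costs one variable write, and only the modules that actually display comments pay for a rebuild, the next time one of their pages is requested.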