[drupal-devel] Drupal.org cache statistics

Nicholas Ivy nji at njivy.org
Thu May 26 23:42:22 UTC 2005


Caching idea #4:

Use feedback to control the maximum number of non-cached pages per 
minute by adjusting the cache delay in idea #2 below.  Site 
administrators could define the maximum, then when site load passes 
that threshold the cache increases its delay until the number of 
rendered pages is no more than the maximum setting.  Of course, when 
the load is low, act normally -- don't try to increase the load. ;-)
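
A rough sketch of that feedback loop, in Python for illustration only. The function name, the multiplicative step, and the upper bound are my own assumptions, not anything in Drupal; the 150-second base delay is just the 2.5-minute figure from the message below.

```python
def adjust_cache_delay(delay, rendered_per_minute, max_per_minute,
                       base_delay=150.0, step=1.25, max_delay=3600.0):
    """Return the next cache delay in seconds.

    delay               -- current cache delay in seconds
    rendered_per_minute -- non-cached pages rendered in the last minute
    max_per_minute      -- administrator-defined maximum
    base_delay, step, max_delay are hypothetical tuning knobs.
    """
    if rendered_per_minute > max_per_minute:
        # Over the threshold: lengthen the delay, up to a ceiling.
        return min(delay * step, max_delay)
    # At or under the threshold: drift back toward the normal delay,
    # but never go below it (don't try to increase the load).
    return max(delay / step, base_delay)
```

Called once a minute with the latest count, this raises the delay while the site is over the limit and relaxes it back to normal once the load drops.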

Nic

On May 26, 2005, at 6:06 PM, Nicholas Ivy wrote:

> I haven't heard any new caching ideas for a while, so I sat down and 
> brainstormed a bit.  I began by re-phrasing Dries's statistics:
>
> Assuming that caching applies only to anonymous visitors for the sake 
> of approximation, these numbers tell me that 35% of all anonymous 
> requests (30 / 85) are identical to one of the last 170 anonymous 
> requests (85% of 200).  Then, every 2.5 minutes or so (200 pages / 
> 93,000 pages * 20 hours), the entire cache is reset.  But roughly 65% 
> of the anonymous requests (55 / 85) are unique within a 2.5-minute 
> span, which means that at least 65% of drupal.org's pages wait more 
> than 2.5 minutes between anonymous page requests.
>
> So the goal is obviously to extend the lifetime of the cache for 
> anonymous visitors.  Wiping the entire cache seems overkill, but what 
> else can be done when every page may contain dynamic information?  
> Other people have said this already.
>
> So what options do we have to improve caching performance?
>
> 1)  Perhaps we could use statistics, like a Poisson distribution, to 
> predict how much time is likely to elapse between requests to a certain 
> page given the activity we recently recorded [1].  If the predicted 
> wait is greater than 2.5 minutes, don't immediately clear the cache 
> for that page.  Instead, wait until there's at least a 50% chance 
> someone will look again.  After all, it's likely that no one is 
> looking at it, so why keep the page up-to-date?
>
> 2)  Wait more than 2.5 minutes before clearing the cache for anonymous 
> users.  If most of the visitors have authenticated, this scheme won't 
> help much.
>
> 3)  Cache page elements as other people have suggested.  Assuming most 
> of the effort to create a page is spent rendering page elements, this 
> scheme should work too.
>
> Feel free to point out mistakes!
>
> Nic
>
> ----
> [1] Assuming requests arrive as a Poisson process, there is a 50% 
> chance of waiting less than t seconds between requests, where t = -1 * 
> number_of_seconds_in_sample_period / 
> number_of_requests_in_sample_period * log(.5), with log being the 
> natural logarithm.
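
The formula in footnote [1] and the rule in idea #1 could be sketched like this, again in Python for illustration. Both function names and the handling of a page with no recorded requests are my assumptions; the 150-second default is the 2.5-minute reset interval estimated above.

```python
import math

def median_wait_seconds(sample_seconds, sample_requests):
    """Time t such that there is a 50% chance the next request
    arrives within t seconds, per the formula in footnote [1]."""
    if sample_requests <= 0:
        # Assumption: no recorded requests means no predictable wait.
        return math.inf
    return -1 * sample_seconds / sample_requests * math.log(0.5)

def should_defer_clear(sample_seconds, sample_requests,
                       reset_interval=150.0):
    """True if the predicted wait exceeds the reset interval, in which
    case the page's cache entry need not be cleared immediately."""
    return median_wait_seconds(sample_seconds, sample_requests) > reset_interval
```

For example, a page requested 12 times in the last hour has a median wait of about 208 seconds, so its cache entry would not be cleared right away. A page requested 60 times in the last hour has a median wait of about 42 seconds and would be cleared as usual.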



