[drupal-devel] [feature] Alternative caching for high traffic websites

Dries drupal-devel at drupal.org
Wed Mar 23 10:21:30 UTC 2005


Issue status update for http://drupal.org/node/19298

 Project:      Drupal
 Version:      cvs
 Component:    base system
 Category:     feature requests
 Priority:     normal
 Assigned to:  Jeremy at kerneltrap.org
 Reported by:  Jeremy at kerneltrap.org
 Updated by:   Dries
 Status:       patch

Looks like this patch can go in soon after the DRUPAL-4-6 branch has
been created.  You'll want to update the cache-related documentation in
system_help() (then again, it might not be necessary).
I wouldn't bother making this more configurable until we have
evaluated/tested this some more.   I'm very interested in testing this
out or in getting some performance numbers.


Dries



Previous comments:
------------------------------------------------------------------------

March 23, 2005 - 06:32 : Jeremy at kerneltrap.org

Attachment: http://drupal.org/files/issues/database.mysql_1.patch (382 bytes)

Drupal's existing caching mechanism doesn't perform well on highly
dynamic websites in which the cache is flushed frequently.  One example
is a site that is under attack by a spambot that is posting spam
comments every few seconds, causing all cached pages to be flushed
every few seconds.
The attached patches provide the following cache methods:
 1) CACHE_DISABLED:  no caching.  This is unchanged from before.
 2) CACHE_ENABLED_STRICT: This is Drupal's existing caching method, in
which the cache is flushed immediately whenever it contains invalid
data.
 3) CACHE_ENABLED_LOOSE: This is the new caching method (feel free to
suggest a better name, this is the best I could come up with tonight). 
Essentially, it immediately flushes the cache only for specific users
who have modified cached data (whether or not they are logged in),
delaying the flushing of data for other users by several minutes.
This feature requires a new 'cache' field in the sessions table to
track the cache status for each user, whether or not they are logged
in.
A one line change to the system.module updates the cache configuration
settings to include an additional option for loose caching.
The rest of the changes are in bootstrap.inc, as follows:
 - cache_clear_all: if strict caching is enabled, nothing is changed.
If loose caching is enabled, we delay the flushing of the entire cache
for 5 minutes.  Instead, we update this user's cache field in the
sessions table so we know any cached pages created before the current
time are no longer valid for this user.  We only set the global
"cache_flush" variable if it's not already set.  If the global
"cache_flush" variable was set more than 5 minutes ago, we flush all
cached pages that were expired when the "cache_flush" variable was set.
 - cache_get:  if strict caching is enabled, nothing is changed.  If
loose caching is enabled, we first check the global "cache_flush"
variable and if older than 5 minutes we do a garbage collection call to
cache_clear_all.  Next, we see if there is a valid cache entry for us. 
If the cache entry is older than our session table's "cache" field, we
don't use it.  This will cause the cache to be regenerated by the
current user, and ultimately updated for all users.
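
The two functions described above can be sketched as a small in-memory
model.  This is a Python illustration under stated assumptions, not the
PHP from the attached bootstrap.inc patch: the class name LooseCache,
the explicit "now" arguments, the dict-based storage, and the choice to
clear and then re-arm the "cache_flush" marker after a delayed flush are
all hypothetical, and "expired" is simplified to "created at or before
the marker".

```python
from dataclasses import dataclass, field
from typing import Optional

CACHE_DISABLED, CACHE_ENABLED_STRICT, CACHE_ENABLED_LOOSE = range(3)
FLUSH_DELAY = 300  # seconds; the 5 minute delay described above


@dataclass
class LooseCache:
    """Hypothetical in-memory model of the strict and loose cache modes."""
    mode: int = CACHE_ENABLED_LOOSE
    pages: dict = field(default_factory=dict)     # key -> (data, created)
    sessions: dict = field(default_factory=dict)  # sid -> sessions.cache value
    cache_flush: Optional[int] = None             # time the marker was set

    def cache_set(self, key, data, now):
        self.pages[key] = (data, now)

    def _flush_if_due(self, now):
        # Delayed flush: drop pages created at or before the marker
        # (assumption: stands in for "expired when cache_flush was set").
        if self.cache_flush is not None and now - self.cache_flush > FLUSH_DELAY:
            cutoff = self.cache_flush
            self.pages = {k: v for k, v in self.pages.items() if v[1] > cutoff}
            self.cache_flush = None

    def cache_clear_all(self, sid, now):
        if self.mode == CACHE_ENABLED_STRICT:
            self.pages.clear()  # existing behaviour: flush immediately
            return
        # Loose: invalidate older pages for this user only.
        self.sessions[sid] = now
        self._flush_if_due(now)
        # Only set the global marker if it is not already set.
        if self.cache_flush is None:
            self.cache_flush = now

    def cache_get(self, key, sid, now):
        if self.mode == CACHE_DISABLED:
            return None
        if self.mode == CACHE_ENABLED_LOOSE:
            self._flush_if_due(now)  # garbage collection on read
        entry = self.pages.get(key)
        if entry is None:
            return None
        data, created = entry
        # Entries older than this user's sessions.cache value are skipped.
        if self.mode == CACHE_ENABLED_LOOSE and created < self.sessions.get(sid, 0):
            return None
        return data
```

In this model, a user who has just modified content misses the cache
and regenerates the page, while other users keep being served the older
entry until either that regeneration or the delayed flush replaces it.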
Attached is the patch to database.mysql.  To add manually, execute the
following:
ALTER TABLE sessions ADD cache int(11) NOT NULL default '0' AFTER timestamp;


------------------------------------------------------------------------

March 23, 2005 - 06:33 : Jeremy at kerneltrap.org

Attachment: http://drupal.org/files/issues/system.module_5.patch (1.49 KB)

Attached is the patch for the system.module, adding the new cache
configuration option.


------------------------------------------------------------------------

March 23, 2005 - 06:36 : Jeremy at kerneltrap.org

Attachment: http://drupal.org/files/issues/bootstrap.inc.patch (3.62 KB)

And finally, the patch for bootstrap.inc.
BTW:  Do we want to make the 'cache_flush_delay' configurable?  Or
perhaps just stick with a sane default?  Perhaps 60 seconds is more
sane than 300 seconds?
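
To get a rough feel for the 60 versus 300 second question, here is a
small Python sketch that counts full cache flushes under a steady
stream of cache-clearing posts.  It is a back-of-the-envelope model
only: the function name count_full_flushes, the 5 second post interval,
and the one hour window are assumptions picked for illustration, and it
re-arms the "cache_flush" marker right after each delayed flush, which
may differ from the patch.

```python
def count_full_flushes(post_interval, duration, delay=None):
    """Count full-cache flushes caused by a post every post_interval seconds.

    delay=None models strict caching (every post flushes everything);
    otherwise a flush happens only once the marker is older than delay.
    """
    flushes = 0
    flush_marker = None  # stands in for the global "cache_flush" variable
    for now in range(0, duration, post_interval):
        if delay is None:
            flushes += 1
            continue
        if flush_marker is not None and now - flush_marker > delay:
            flushes += 1
            flush_marker = None
        if flush_marker is None:
            flush_marker = now
    return flushes


strict = count_full_flushes(5, 3600)
loose_300 = count_full_flushes(5, 3600, delay=300)
loose_60 = count_full_flushes(5, 3600, delay=60)
print(strict, loose_300, loose_60)
```

Under these assumptions the model reports 720 full flushes per hour for
strict caching, 11 with a 300 second delay, and 55 with a 60 second
delay, so even the shorter delay removes most of the flushing while
letting other users see changes sooner.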


------------------------------------------------------------------------

March 23, 2005 - 06:44 : Jeremy at kerneltrap.org

Attachment: http://drupal.org/files/issues/bootstrap.inc_0.patch (3.61 KB)

Updated version of the bootstrap.inc.patch, removing an erroneous "else"
that prevented the cache from being properly updated for a specific user
after a delayed flush.


------------------------------------------------------------------------

March 23, 2005 - 09:12 : Anonymous

Looks very good. Please provide patches in one file, though.
Gerhard


------------------------------------------------------------------------

March 23, 2005 - 10:35 : stefan nagtegaal

First of all, I do not have huge sites under my jurisdiction, so maybe
my points here aren't valid. If so, tell me and I'll shut up.
I took a look at your patches, and I truly do not understand why you
add more caching methods in 'admin/system'. I think it would be
better if we stay with the current options (Enabled, Disabled) and let
Drupal choose between the caching methods (CACHE_ENABLED_LOOSE and
CACHE_ENABLED_STRICT). This would make your patch to system.module a
little smaller and easier to understand for newbies.
Also, as you said yourself, I don't see why we should make the
'cache_flush_delay' configurable. I mean, how would you know what a
good time would be? Couldn't Drupal find out the right value for that,
so the administrator doesn't have to?
Again, I'm not speaking as someone who is in need of something like
this, but I can understand some people are. What I'm trying to say
is that the idea is probably pretty good, but with fewer options and
more automagic cache settings it would be more user-friendly and
wouldn't scare off newbies (or me)...




