[documentation] Drupal guide to caching

Larry Garfield larry at garfieldtech.com
Tue Apr 1 04:50:39 UTC 2008


By static caching, do you mean things like this:

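function get_stuff($key) {
  // Persists across calls, but only for the current page request.
  static $stuff = array();

  // isset() rather than empty(), so a cached falsy value is not re-queried.
  if (!isset($stuff[$key])) {
    $stuff[$key] = db_query(...);
  }

  return $stuff[$key];
}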

The primary use for that is to reduce repetitive database hits, for which it 
works very well with very low overhead.  It's not that great for anything 
more complex than that, but I use it frequently because the extra bytes of 
RAM cost less than the extra database traffic.
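
A minimal sketch of that effect, runnable outside Drupal. The names are made
up for illustration: fake_db_lookup() is a hypothetical stand-in for
db_query() that only counts how often it is called, and get_stuff_cached()
follows the same pattern as get_stuff().

```php
<?php
// Hypothetical stand-in for db_query(): counts calls instead of hitting a database.
function fake_db_lookup($key) {
  $GLOBALS['lookup_count']++;
  return "value for $key";
}

// Static-cache pattern: each key is looked up at most once per request.
function get_stuff_cached($key) {
  static $stuff = array();

  if (!isset($stuff[$key])) {
    $stuff[$key] = fake_db_lookup($key);
  }

  return $stuff[$key];
}

$GLOBALS['lookup_count'] = 0;

// Three requests for 'a' and one for 'b' within a single page request.
get_stuff_cached('a');
get_stuff_cached('a');
get_stuff_cached('b');
get_stuff_cached('a');

echo "lookups: {$GLOBALS['lookup_count']}\n"; // prints "lookups: 2"
```

Four calls cost two lookups here. The static array is thrown away when the
request ends, so the saving only shows up when the same key is requested more
than once while building a single page.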

On Monday 31 March 2008, Nancy Wichmann wrote:
> I'd like to see some discussion on "static" caching, which is done many
> places in Drupal (e.g. taxonomy.module, path.inc).  From the tests that I
> have done with it, static caching seems to be pretty worthless because it
> doesn't last long enough to do much good.
>
> And, frankly, I'm not even sure that the cache tables are that beneficial
> either.  I've been doing some performance work on one of my modules (still
> on D5) and find that the slowest queries in my system are to the cache
> tables.  It would take a considerable amount of rebuild to offset the
> difference, and most of the uses I've seen aren't that complex (yes, there
> are some).
>
> Nancy E. Wichmann, PMP
>
> -----Original Message-----
> From: documentation-bounces at drupal.org
> [mailto:documentation-bounces at drupal.org]On Behalf Of Steve Dondley
> Sent: Sunday, March 30, 2008 11:32 PM
> To: documentation at drupal.org
> Subject: Re: [documentation] Drupal guide to caching
>
> Here's a revised version with feedback given to date and some of my
> own improvements. I left out the last 4 kinds of caching because I
> have not changed those yet.
>
>
> This guide is an introduction to the various caching mechanisms Drupal
> uses to speed the site. It's primarily designed for developers but a
> lot of it should be understandable by Drupal site administrators who
> are familiar with the fundamentals of how Drupal delivers content.
>
> The Basics
> Caches are used to improve the performance of your Drupal site by
> taking a snapshot of frequently accessed, relatively static and/or
> expensive-to-process data and copying it into a location and format so
> that it's much faster to retrieve the next time it's needed.
> Periodically, cached data must be deleted and updated with the most
> recent version of the data. Otherwise, the cache risks going "stale"
> and the Drupal website will output old data even though newer data
> exists.
>
> The problem of stale data can be particularly troublesome for new
> developers who can get confused as to why changes they make to
> Drupal's code don't seem to have any effect. For example, a developer
> might make a change to the $items array in the hook_menu function. But
> because the $items array is cached, the developer will be left
> scratching their head as to why their change isn't reflected in the
> website's output on the next page load. As most experienced
> programmers can attest, this kind of confusion leads to wild goose
> chases hunting for nonexistent problems. Hopefully, by reading this
> document, you'll have a more pleasant Drupal development experience.
>
> What gets cached, where it gets cached, and how
> There are two different kinds of caching that take place in Drupal:
> file-based and database-based caching.
>
> File-based caching
> The best example of file-based caching is the consolidation of all the
> stylesheet and javascript files your site delivers into just a few
> files. This is particularly useful for Drupal sites where, depending
> on which modules are enabled, it's not unusual to have a dozen or more
> files to deliver the javascript and style sheet data associated with
> each html page. Having so many files to download increases page load
> times because the browser has to make several round trips to the
> server to download them all. By reducing the number of files to
> download, the file caching feature will cut page loading time
> significantly.
>
> The style sheet and javascript cache files are stored in the
> respective "css" and "js" directories within the file system directory
> ("files" by default) as set on the file system settings page. Style
> sheet files are compressed by removing whitespace and linebreaks to
> further increase download speed. Javascript files are not compressed
> because this can introduce bugs into the code.
>
> How to turn on stylesheet and javascript file caching
> ======================================================
> First, be sure your "Download method" is set to "Public" (set
> at admin/settings/file-system). You will not be able to use stylesheet
> and javascript caching when your "Download method" is set to
> "Private." This is because with the private download method, Drupal
> will deny access to the stylesheet and javascript files in your files
> directory. *I am not sure about the accuracy of the last sentence*
>
> Next, in the navigation menu, select "Administer -> Site Configuration
> -> Performance" (admin/settings/performance), scroll down to the
> "Bandwidth optimizations" fieldset, check off "Enabled" for the kinds
> of files you wish to cache and click "Save configuration".
>
> Note that turning on stylesheet and javascript caching can interfere
> with theme development and should only be enabled in a production
> environment. If you need to make a change to a cached stylesheet or
> javascript file, disable the file caching feature and reenable it
> after you make your change. Alternatively, you can delete the cached
> files and they will be regenerated for you on the next page load.
>
> Finally, be aware that stylesheet and javascript caching can cause
> problems on sites that are run in a load-balanced environment (across
> two or more servers). This is because the cached files may be stored
> on one server and not the other.
>
> Database-based caching
> Drupal can also store cached data in special tables in your database.
> Drupal core sets up seven tables for this purpose. Although these
> tables share the same table structure and it would be possible to
> combine them into a single table, they are split up because smaller
> tables improve cache performance. If needed, other modules can add
> additional cache tables to the database. The seven tables set up by
> core are:
>
> cache
> An "all purpose" table that can be shared by modules for storing a few
> rows of cached data. Drupal core uses this table to store the
> following data:
>
> * Variable data. These are variables that are set with the
> variable_set() function and retrieved with the variable_get()
> function. This cache saves Drupal from making individual queries for
> each variable_get() call. When this cache goes stale, it is refreshed
> by data from the variable table. A call to the variable_set function
> will trigger a cache refresh.
>
> * Theme registry data. This registry is a listing of all the themes
> that can be overridden by theme developers. The existence of the theme
> registry makes it easy to override themes by allowing a user to simply
> drop a tpl.php file in the theme's directory.  When this cache goes
> stale, it's regenerated by calling hook_theme() and doing a file
> system scan for template files to find additional theme functions.
>
> * Schema data. This data contains information about the structure of
> tables in the database. Caching the schema data saves Drupal from
> having to reload infrequently changing database schema information
> from .install files every time it's needed. When this cache goes stale,
> it's regenerated by calling hook_schema() which collects schema data
> from each module's .install file. A visit to the admin/build/modules
> page, enabling or disabling a module, or updating a module through the
> update script will trigger a cache refresh.
>
> cache_block
> A table for storing content generated by your blocks. This saves
> Drupal from having to repeatedly query the database for unchanged data
> related to blocks. When this cache goes stale, it is refreshed by data
> from the boxes table where block content is stored and the blocks
> table which stores block configuration data. Any update to a block's
> content or configuration will trigger a cache refresh for that block.
> The entire cache is refreshed when a node, comment, user, or taxonomy
> term is added or updated. Module developers have the option of gaining
> more control over when a particular block's cache is refreshed using
> cache granularity settings for their blocks. Refer to the constants
> defined at the top of the block.module for further details.
>
> The block cache can be turned on and off at "Administer -> Site
> Configuration -> Performance" (admin/settings/performance).
>
> Note that block caching is inactive when modules defining content
> access restrictions are enabled. For example, if organic groups,
> content access, taxonomy access modules or other modules that restrict
> access to certain kinds of content are turned on, block caching will
> be turned off.
>
> cache_filter
> A table for storing filtered pieces of content. This saves Drupal from
> having to run the same expensive regular expression operations on
> unchanged content that gets run through the input filters. When this
> cache goes stale, it is refreshed by the content data (e.g. content in
> the node_revisions table, blocks table, etc.). Cron jobs, updates to
> nodes, and updates to filter formats will trigger cache refreshes.
>
> Question: What is the thinking behind having the cache_filter refreshed on cron
> jobs?
>
> On Sat, Mar 29, 2008 at 9:33 PM, Steve Dondley <sdondley at gmail.com> wrote:
> > I'm writing a drupal guide to caching. If you know something about
> >  caching and are so inclined, please add your two cents about what I
> >  have so far:
> >
> >  THE DRUPAL 6 GUIDE TO CACHING
> >
> >  This guide is an introduction to the various caching mechanisms Drupal
> >  uses to speed the site. It's primarily designed for developers but a
> >  lot of it should be understandable by Drupal site administrators who
> >  are familiar with the basics of how Drupal delivers content.
> >
> >  The Basics
> >  Caches are used to improve the performance of your Drupal site. Rather
> >  than extracting the same data over and over again every time a page is
> >  loaded, caching stores frequently accessed and relatively static data
> >  in a convenient place and format.
> >
> >  Caching has a drawback in that it can lead to "stale" data. This means
> >  that the website outputs old data or content from the cache even
> >  though newer stuff exists somewhere else. This problem can be
> >  particularly troublesome for developers who can get confused as to why
> >  changes they expect to see happen aren't. Hopefully, by reading this
> >  document, you'll have a more pleasant and less confusing Drupal
> >  experience.
> >
> >  What gets cached, where it gets cached, and how
> >  There are two different ways Drupal stores cached data:
> >
> >  1) Using files
> >  Drupal can consolidate all the css files your site delivers on each
> >  page load and place them into a fewer number of files. The resulting
> >  files are also compressed. This is important for Drupal sites where
> >  it's not unusual to have a dozen or more stylesheets associated with
> >  each page, depending on how many modules are enabled. Having so many
> >  stylesheets will increase page load times because the browser has to
> >  make several round trips to the server to download all the stylesheet
> >  files. By using the css caching feature, you can consolidate these
> >  files into fewer larger files and decrease page load time
> >  significantly.
> >
> >  Just like style sheets, javascript files can also be consolidated.
> >  However, these files are not compressed.
> >
> >  How to turn on stylesheet and javascript file caching
> >  ======================================================
> >  First, be sure your "Download method" is set to "Public" (set
> >  at admin/settings/file-system). You will not be able to use stylesheet
> >  and javascript caching when your "Download method" is set to
> >  "Private."
> >
> >  Next, in the navigation menu, select "Administer -> Site Configuration
> >  -> Performance" (admin/settings/performance) and scroll down to the
> >  "Bandwidth optimizations" fieldset and check off "Enabled" for both
> >  CSS files and Javascript files.
> >
> >  Note that turning on stylesheet and javascript caching can interfere
> >  with theme development and should only be enabled in a production
> >  environment.
> >
> >  2) In your database
> >  The main location where Drupal stores cached data is in special tables
> >  in your database. Drupal core sets up seven tables for caching data.
> >  Other modules will add additional cache tables to the database as
> >  needed. The seven tables set up by core are:
> >
> >  cache
> >  An "all purpose" table that can be shared by various modules. This
> >  table is designed for modules that need to store only a few rows of
> >  data. Drupal core uses this table to store the following data:
> >
> >  * Variable data. These are variables that are set with the
> >  variable_set() function and retrieved with the variable_get()
> >  function. When this cache goes stale, it is refreshed by data from
> >  the variable table. A call to the variable_set function will trigger a
> >  cache refresh.
> >
> >  * Theme registry data. This registry is a listing of all the themes
> >  that can be overridden by theme developers. The existence of the theme
> >  registry makes it easy to override themes by allowing a user to simply
> >  drop a tpl.php file in the theme's directory.  When this cache goes
> >  stale, it's regenerated from the function definitions contained in
> >  module and theme files.
> >
> >  * Schema data. This data contains information about the table
> >  structure of the database. When this cache goes stale, it's
> >  regenerated from ???. A visit to the admin/build/modules page will
> >  trigger a cache refresh.
> >
> >  cache_block
> >  A table for storing content generated by your blocks. This saves
> >  Drupal from having to repeatedly query the database for unchanged
> >  block content. When this cache goes stale, it is refreshed by data
> >  from the boxes table where block content is stored. Any update to a
> >  block's content will trigger a cache refresh for that block. The
> >  entire cache is refreshed when a node, comment, user, or taxonomy term
> >  is added or updated. Module developers have the option of gaining more
> >  control over when a particular block's cache is refreshed using cache
> >  granularity settings for their blocks. Refer to the constants defined
> >  at the top of the block.module for further details.
> >
> >  The block cache can be turned on and off at "Administer -> Site
> >  Configuration -> Performance" (admin/settings/performance).
> >
> >  Note that block caching is inactive when modules defining content
> >  access restrictions are enabled. For example, if organic groups,
> >  content access, taxonomy access modules or other modules that restrict
> >  access to certain kinds of content are turned on, block caching will
> >  be turned off.
> >
> >  cache_filter
> >  A table for storing filtered pieces of content. This saves Drupal from
> >  having to run the same expensive regular expression operations on
> >  unchanged content that gets run through the input filters. When this
> >  cache goes stale, it is refreshed by the data in the node_revisions
> >  table which contains node content. Cron jobs, updates to nodes, and
> >  updates to filter formats will trigger cache refreshes.
> >
> >  Question: What is the thinking behind having the cache_filter refreshed on
> >  cron jobs?
> >
> >  AUTHOR'S NOTE: The items below have to be researched more to determine
> >  what triggers them to refresh and are unfinished.
> >
> >  cache_form
> >  A table for storing forms generated by the forms api. This saves
> >  Drupal from having to rebuild unchanged forms. When this cache goes
> >  stale, it is refreshed by output from the form module.
> >
> >  cache_menu
> >  A table for storing the menu items and menu item hierarchies. This
> >  saves Drupal from having to regenerate the data structures needed to
> >  define the menu items and their hierarchies on each page load. When the
> >  data goes stale, the cache is refreshed from the data contained in the
> >  menu table.
> >
> >  cache_page
> >  A table for storing pages for anonymous users. This saves Drupal from
> >  making dozens or even hundreds of expensive queries needed to generate
> >  a page. When the cache for a particular page goes stale, it gets
> >  refreshed by the html output for that page. This cache can be turned
> >  on and off under "Administer -> Site Configuration -> Performance"
> >  (admin/settings/performance).
> >
> >  cache_update
> >  A table used to store information about installed modules and themes.
> >  This saves Drupal from having to perform two very expensive operations
> >  for listing the installed modules and themes and the status of these
> >  modules and releases compared to what's available for download on
> >  drupal.org. When this cache goes stale, it is refreshed from the data
> >  in the system table.
> >
> >  --
> >  Prometheus Labor Communications, Inc.
> >  http://prometheuslabor.com
> >  413-572-1300
> >
> >  Communicate or Die: American Labor Unions and the Internet
> >  http://communicateordie.com
>
> --
> Pending work: http://drupal.org/project/issues/documentation/
> List archives: http://lists.drupal.org/pipermail/documentation/


-- 
Larry Garfield			AIM: LOLG42
larry at garfieldtech.com		ICQ: 6817012

"If nature has made any one thing less susceptible than all others of 
exclusive property, it is the action of the thinking power called an idea, 
which an individual may exclusively possess as long as he keeps it to 
himself; but the moment it is divulged, it forces itself into the possession 
of every one, and the receiver cannot dispossess himself of it."  -- Thomas 
Jefferson

