[documentation] Drupal guide to caching

Steve Dondley sdondley at gmail.com
Sun Mar 30 01:33:12 UTC 2008


I'm writing a Drupal guide to caching. If you know something about
caching and are so inclined, please add your two cents about what I
have so far:

THE DRUPAL 6 GUIDE TO CACHING

This guide is an introduction to the various caching mechanisms Drupal
uses to speed up a site. It's primarily designed for developers, but a
lot of it should be understandable by Drupal site administrators who
are familiar with the basics of how Drupal delivers content.

The Basics
Caches are used to improve the performance of your Drupal site. Rather
than extracting the same data over and over again every time a page is
loaded, caching stores frequently accessed and relatively static data
in a convenient place and format.

Caching has a drawback in that it can lead to "stale" data. This means
that the website outputs old data or content from the cache even
though a newer version exists in the original source. This problem can
be particularly troublesome for developers, who can get confused when
the changes they expect to see do not appear. Hopefully, by reading
this document, you'll have a more pleasant and less confusing Drupal
experience.
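
To make the idea of stale data concrete, here is a minimal sketch in
Python. It is not Drupal code; the names source, cache, read and write
are made up for illustration. It shows a value being served from the
cache after the underlying data has changed, and how clearing the
cached copy on write avoids that.

```python
# Illustrative only: a tiny read-through cache, not Drupal code.
source = {"site_name": "My site"}   # stands in for the authoritative data
cache = {}                          # stands in for a cache table

def read(key):
    """Return the cached value, loading it from the source on a miss."""
    if key not in cache:
        cache[key] = source[key]
    return cache[key]

def write(key, value, invalidate=True):
    """Update the source and, optionally, drop the cached copy."""
    source[key] = value
    if invalidate:
        cache.pop(key, None)

first = read("site_name")                         # miss: loaded from source
write("site_name", "New name", invalidate=False)  # cache is left untouched
stale = read("site_name")                         # old value: stale data
write("site_name", "New name")                    # cached copy is dropped
fresh = read("site_name")                         # reloaded from the source
```

The second read returns the old value because nothing told the cache
that the source had changed. Most of the "what triggers a refresh"
notes in the rest of this guide are about exactly that step.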

What gets cached, where it gets cached, and how
There are two different ways Drupal stores cached data:

1) Using files
Drupal can consolidate all the CSS files your site delivers on each
page load and place them into a smaller number of files. The resulting
files are also compressed. This is important for Drupal sites, where
it's not unusual to have a dozen or more stylesheets associated with
each page, depending on how many modules are enabled. Having so many
stylesheets increases page load times because the browser has to make
a separate request to the server for each stylesheet file. By using
the CSS caching feature, you can consolidate these files into fewer,
larger files and decrease page load time significantly.

Just like style sheets, javascript files can also be consolidated.
However, these files are not compressed.
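
The consolidation step can be sketched roughly as follows. This is a
Python illustration of the general technique, not Drupal's
implementation. The function name aggregate_css, the file paths, the
naive whitespace stripping and the choice to name the output after an
MD5 hash of its contents are all assumptions made for the example.

```python
import hashlib
import re

def aggregate_css(stylesheets):
    """Combine (path, text) pairs, in order, into one compressed file.

    Returns (filename, combined_text). The compression here is naive:
    it drops comments and collapses runs of whitespace.
    """
    parts = []
    for path, text in stylesheets:
        text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)  # drop comments
        text = re.sub(r"\s+", " ", text).strip()           # collapse spaces
        parts.append(text)
    combined = "\n".join(parts)
    # Naming the file after its contents means a changed stylesheet
    # produces a new file name, so browsers do not reuse an old copy.
    name = hashlib.md5(combined.encode("utf-8")).hexdigest() + ".css"
    return name, combined

# Hypothetical input: two small stylesheets.
sheets = [
    ("modules/node/node.css", "/* node */\n.node {\n  margin: 0;\n}\n"),
    ("modules/user/user.css", "#user  { color: red; }\n"),
]
name, combined = aggregate_css(sheets)
```

Order is preserved on purpose, since later CSS rules override earlier
ones. The browser now makes one request instead of two.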

How to turn on stylesheet and javascript file caching
======================================================
First, be sure your "Download method" is set to "Public" (set at
admin/settings/file-system). You will not be able to use stylesheet
and javascript caching when your "Download method" is set to
"Private."

Next, in the navigation menu, select "Administer -> Site Configuration
-> Performance" (admin/settings/performance) and scroll down to the
"Bandwidth optimizations" fieldset and check off "Enabled" for both
CSS files and Javascript files.

Note that turning on stylesheet and javascript caching can interfere
with theme development and should only be enabled in a production
environment.

2) In your database
The main location where Drupal stores cached data is in special tables
in your database. Drupal core sets up seven tables for caching data.
Other modules will add additional cache tables to the database as
needed. The seven tables set up by core are:

cache
An "all purpose" table that can be shared by various modules. This
table is designed for modules that need to store only a few rows of
data. Drupal core uses this table to store the following data:

* Variable data. These are variables that are set with the
variable_set() function and retrieved with the variable_get()
function. When this cache goes stale, it is refreshed by data from
the variable table. A call to the variable_set() function will trigger
a cache refresh.

* Theme registry data. This registry is a listing of all the theme
functions and templates that can be overridden by theme developers.
The existence of the theme registry makes it easy to override them by
allowing a user to simply drop a tpl.php file in the theme's
directory. When this cache goes stale, it's regenerated from the
function definitions contained in module and theme files.

* Schema data. This data contains information about the table
structure of the database. When this cache goes stale, it's
regenerated from ???. A visit to the admin/build/modules page will
trigger a cache refresh.

cache_block
A table for storing content generated by your blocks. This saves
Drupal from having to repeatedly rebuild unchanged block content. When
this cache goes stale, it is refreshed by regenerating the block:
custom blocks created in the administration pages are read from the
boxes table, while blocks provided by modules are rebuilt by the
module that defines them. Any update to a block's content will trigger
a cache refresh for that block. The entire cache is refreshed when a
node, comment, user, or taxonomy term is added or updated. Module
developers can gain finer control over how a particular block is
cached by using the cache granularity settings for their blocks. Refer
to the constants defined at the top of block.module for further
details.

The block cache can be turned on and off at "Administer -> Site
Configuration -> Performance" (admin/settings/performance).

Note that block caching is inactive when modules defining content
access restrictions are enabled. For example, if organic groups,
content access, taxonomy access modules or other modules that restrict
access to certain kinds of content are turned on, block caching will
be turned off.
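
Cache granularity is easiest to see as a question of how the cache key
is built. The sketch below is in Python and is not Drupal code. The
constant names and values are the ones I believe are defined in Drupal
6's block.module, so check them against the source. The function
block_cache_id and its key layout are hypothetical simplifications.

```python
# Values as I understand them from block.module; verify before relying
# on them.
BLOCK_NO_CACHE = -1
BLOCK_CACHE_PER_ROLE = 0x0001
BLOCK_CACHE_PER_USER = 0x0002
BLOCK_CACHE_PER_PAGE = 0x0004
BLOCK_CACHE_GLOBAL = 0x0008

def block_cache_id(module, delta, mode, uid, roles, path):
    """Build a cache key for a block, or return None if uncacheable.

    The finer the granularity, the more parts the key has, and the
    more separate copies of the block end up in the cache.
    """
    if mode == BLOCK_NO_CACHE:
        return None
    parts = [module, str(delta)]
    if mode & BLOCK_CACHE_PER_ROLE:
        parts.append("r." + ",".join(str(r) for r in sorted(roles)))
    elif mode & BLOCK_CACHE_PER_USER:
        parts.append("u." + str(uid))
    if mode & BLOCK_CACHE_PER_PAGE:
        parts.append(path)
    return ":".join(parts)

# One copy shared by everyone:
global_id = block_cache_id("user", 3, BLOCK_CACHE_GLOBAL, 7, [2], "node/1")
# One copy per combination of roles and page:
role_page_id = block_cache_id(
    "user", 3, BLOCK_CACHE_PER_ROLE | BLOCK_CACHE_PER_PAGE, 7, [4, 2], "node/1")
```

A global block is stored once, while a per-user, per-page block is
stored once for every user on every page, so finer granularity trades
cache size for correctness.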

cache_filter
A table for storing filtered pieces of content. This saves Drupal from
having to run the same expensive regular expression operations on
unchanged content that gets run through the input filters. When this
cache goes stale, it is refreshed by the data in the node_revisions
table which contains node content. Cron jobs, updates to nodes, and
updates to filter formats will trigger cache refreshes.
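
One way such a cache can work is to key each entry on the input format
plus a hash of the unfiltered text, and to give each entry an expiry
time that a periodic job cleans up. The Python sketch below shows that
technique only. It is not Drupal's code, and the names filtered_text,
run_filters and cron, as well as the 24-hour lifetime, are assumptions
made for the example.

```python
import hashlib

filter_cache = {}           # cache id -> (expires_at, filtered text)
LIFETIME = 24 * 60 * 60     # assumed lifetime of an entry, in seconds

def run_filters(text):
    """Stand-in for the expensive input filters."""
    return text.replace("\n", "<br />")

def filtered_text(text, fmt, now):
    """Return filtered text, reusing a cached copy if one is still valid."""
    cid = "%d:%s" % (fmt, hashlib.md5(text.encode("utf-8")).hexdigest())
    hit = filter_cache.get(cid)
    if hit is not None and hit[0] > now:
        return hit[1]
    result = run_filters(text)
    filter_cache[cid] = (now + LIFETIME, result)
    return result

def cron(now):
    """Remove entries whose expiry time has passed."""
    expired = [cid for cid, (expires, _) in filter_cache.items()
               if expires <= now]
    for cid in expired:
        del filter_cache[cid]

out = filtered_text("a\nb", 1, now=0)    # miss: filters run, entry stored
again = filtered_text("a\nb", 1, now=10) # hit: same text, same format
```

Because the key depends on the text itself, edited content simply gets
a new entry. The old entry is never looked up again, so something has
to prune it eventually.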

Question: What is the thinking behind having the cache_filter refreshed on cron jobs?

AUTHOR'S NOTE: The items below are unfinished and need more research
to determine what triggers them to refresh.

cache_form
A table for storing forms generated by the forms API. This saves
Drupal from having to rebuild unchanged forms. When this cache goes
stale, it is refreshed by output from the form module.

cache_menu
A table for storing the menu items and menu item hierarchies. This
saves Drupal from having to regenerate the data structures needed to
define the menu items and their hierarchies on each page load. When
the data goes stale, the cache is refreshed from the data contained in
the menu table.

cache_page
A table for storing pages for anonymous users. This saves Drupal from
making the dozens or even hundreds of expensive queries needed to
generate a page. When the cache for a particular page goes stale, it
gets refreshed by the HTML output for that page. This cache can be
turned on and off under "Administer -> Site Configuration ->
Performance" (admin/settings/performance).
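
The reason only anonymous users benefit can be sketched like this. The
Python below is an illustration, not Drupal code. The names serve and
render are hypothetical, and a user ID of 0 stands for an anonymous
visitor.

```python
page_cache = {}   # path -> rendered HTML
renders = 0       # counts how often the expensive work is done

def render(path):
    """Stand-in for the queries and theming needed to build a page."""
    global renders
    renders += 1
    return "<html><body>%s</body></html>" % path

def serve(path, uid):
    """Serve a page, using the cache only for anonymous visitors."""
    if uid != 0:
        # Logged-in users may see personalized output, so their pages
        # are always built from scratch.
        return render(path)
    if path not in page_cache:
        page_cache[path] = render(path)
    return page_cache[path]

serve("node/1", uid=0)   # anonymous, first visit: page is rendered
serve("node/1", uid=0)   # anonymous, second visit: served from cache
serve("node/1", uid=5)   # logged in: rendered again
```

All anonymous visitors see the same output for a given path, which is
what makes it safe to store one copy per page.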

cache_update
A table used to store information about installed modules and themes.
This saves Drupal from having to perform two very expensive
operations: listing the installed modules and themes, and checking the
status of these modules and releases against what's available for
download on drupal.org. When this cache goes stale, it is refreshed
from the data in the system table.

-- 
Prometheus Labor Communications, Inc.
http://prometheuslabor.com
413-572-1300

Communicate or Die: American Labor Unions and the Internet
http://communicateordie.com
