[development] Caching, caching, caching...

Larry Garfield larry at garfieldtech.com
Sat Jul 22 17:32:43 UTC 2006


On Saturday 22 July 2006 10:30, Dries Buytaert wrote:

> 1. Build a caching algorithm that uses a heuristic to pre-load
> frequently used URL aliases.
>
>     * Advantages: transparent, no configuration required
>
>     * Disadvantages: it's a heuristic, we don't know how it would
> perform, it might be tricky to implement, and MySQL does this
> implicitly (but not as aggressively).

I really like LRU conceptually, but I don't know how we'd implement it.  If 
done in the database, we'd have to write the last-access time back to the 
database each time an alias is accessed, doubling the number of queries 
(unless someone knows of a portable update-on-access field in SQL?).  If done 
in the system cache, then we're back to the patch someone already submitted 
(moshe, I think?).  If done in the session, it would be very simple to 
implement, but as the same person explained to me when I suggested it to him, 
that could get memory-intensive very quickly.
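
For reference, the eviction policy itself is small.  The following is only a 
sketch in Python, not Drupal code: the class name AliasLRU, the lookup 
callback, and the sample paths are all made up for illustration.  It keeps 
recency in memory, so it shows the policy but sidesteps the real question 
above of where that state would live between requests.

```python
from collections import OrderedDict


class AliasLRU:
    """Tiny in-memory LRU cache for path -> alias lookups (illustrative only)."""

    def __init__(self, lookup, capacity=3):
        self.lookup = lookup          # hypothetical function that hits the database
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used first, newest last
        self.misses = 0               # number of times we had to call lookup()

    def get(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)    # mark as most recently used
            return self.entries[path]
        self.misses += 1
        alias = self.lookup(path)             # may be None for unaliased paths
        self.entries[path] = alias
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return alias


# Stand-in for the alias table; the data is invented.
FAKE_TABLE = {"node/1": "about", "node/2": "contact", "node/3": "news"}
cache = AliasLRU(FAKE_TABLE.get, capacity=2)
for p in ["node/1", "node/2", "node/1", "node/3", "node/2"]:
    cache.get(p)
```

With a capacity of two, the third lookup is the only hit; every other lookup 
still goes to the backing store, which is why the capacity would have to be 
tuned per site.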

Another possible guideline is "precache anything that's in a menu", as that 
would include primary and secondary links and the majority of always-used 
links, but probably wouldn't be more than two dozen links on most sites.  
The trick here would be a fast and efficient way of defining "in a menu".

> 2. Provide a textarea that allows administrators to _white_list_ URL
> patterns.

*snip*

> 3. Provide a textarea that allows administrators to _black_list_ URL
> patterns.
>
>     * Advantages: fine-grained control, easy to implement
>
>     * Disadvantages: users usually don't like messing with regular
> expressions, it might take a lot of effort to get the list Just
> Right, and it takes a certain amount of familiarity with Drupal's URL
> scheme (learning curve for new Drupal users).  The behavior might be
> confusing: you add an alias, and it doesn't work because you forgot
> about the list of URL patterns.

Why would you need a textarea and regexes?  Just add a "pre-cache" checkbox to 
the edit-alias screen.  Then the first time the alias lookup is called, it 
does a quick "SELECT ... FROM ... WHERE precache=1".  That gets you what the 
admin thinks are the most common aliases, and neither the UI nor the code 
could get any simpler.

Disadvantage: That's assuming the admin has any idea what the most common 
aliases are. :-)

On the subject of black-listing, though, does anyone ever alias a path that's 
under admin/?  The biggest drain from aliasing that I see right now is all of 
the queries to look up paths that aren't aliased in the first place.
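
If the answer turns out to be "no", the skip is a prefix check before the 
query.  This is a sketch in Python under exactly that assumption; the names 
SKIP_PREFIXES, needs_alias_lookup(), and resolve_alias() are hypothetical, 
and the lookup argument stands in for whatever actually queries the database.

```python
SKIP_PREFIXES = ("admin/",)  # assumption: nothing under admin/ is ever aliased


def needs_alias_lookup(path):
    """Return False for paths we have decided never to look up."""
    return path != "admin" and not path.startswith(SKIP_PREFIXES)


def resolve_alias(path, lookup):
    """Return the alias for path, skipping the query for black-listed paths."""
    if not needs_alias_lookup(path):
        return path              # no query issued
    return lookup(path) or path  # lookup is a hypothetical database call


# Count how many lookups actually reach the (fake) database.
queried = []
table = {"node/1": "about"}


def fake_lookup(path):
    queried.append(path)
    return table.get(path)


results = [resolve_alias(p, fake_lookup)
           for p in ["admin", "admin/settings", "node/1", "node/5"]]
```

Here only node/1 and node/5 reach the database; the two admin paths come 
back unchanged without a query.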

> 4. Stop doing SQL queries when you cached all possible URL aliases.
>
>     * Advantages: transparent, no configuration required, can co-
> exist with (1), (2), (3) and (5).
>
>     * Disadvantages: only works for a subset of all Drupal sites, not
> a solution for larger Drupal sites.

It also doesn't take into account the order in which the page is built.  If 
you only have 5 aliases, but they're all primary links, those are built rather 
late (I think?).  So the system wouldn't finish loading all aliases until it 
was nearly done with the page anyway.
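
To make the ordering concern concrete, here is a sketch in Python of option 
(4) as I read it: count the aliases once, then stop querying when that many 
have been found.  The class CountingAliasCache and the sample paths are 
invented, and a dict stands in for the alias table.

```python
class CountingAliasCache:
    """Stop querying once every alias in the table has been seen."""

    def __init__(self, table):
        self.table = table        # dict standing in for the alias table
        self.total = len(table)   # in practice this would be one COUNT(*) query
        self.found = {}           # aliases discovered so far
        self.queries = 0          # per-path queries issued

    def get(self, path):
        if path in self.found:
            return self.found[path]
        if len(self.found) >= self.total:
            return path           # every alias is known, so path has none
        self.queries += 1
        alias = self.table.get(path)
        if alias is not None:
            self.found[path] = alias
            return alias
        return path               # unaliased; not remembered in this sketch


aliases = {"node/1": "about", "node/2": "contact"}
cache = CountingAliasCache(aliases)
# Unaliased paths are rendered first, aliased links late, as described above.
page_order = ["admin/x", "node/9", "node/1", "node/2", "node/7", "node/8"]
rendered = [cache.get(p) for p in page_order]
```

In this ordering four of the six lookups still query, because the short 
circuit only kicks in after the last aliased link has been rendered.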

> 5. Improve Drupal's high-level page caching so we have to rebuild
> pages less frequently.
>
>     * Advantages: no configuration required, can co-exist with (1),
> (2), (3) and (4), eliminates many more SQL queries.
>
>     * Disadvantages: doesn't work for authenticated users.

I'll leave this one to the cache experts.

-- 
Larry Garfield			AIM: LOLG42
larry at garfieldtech.com		ICQ: 6817012

"If nature has made any one thing less susceptible than all others of 
exclusive property, it is the action of the thinking power called an idea, 
which an individual may exclusively possess as long as he keeps it to 
himself; but the moment it is divulged, it forces itself into the possession 
of every one, and the receiver cannot dispossess himself of it."  -- Thomas 
Jefferson

