[drupal-devel] CCK Node Caching Strategies
Jonathan Chaffer
jchaffer at structureinteractive.com
Sun Mar 6 04:50:25 UTC 2005
The CCK implementation is going to require a resurrection of node
caching, but I'm considering some different approaches. This is an
attempt to document the design process, while at the same time
soliciting suggestions.
For this analysis, assume that each content type has 10 simple text
fields, as well as a multiple-entry text field with an average of 10
entries. We will be looking at the queries necessary to load and
display a listing of 10 of these nodes.
One approach is to attempt to join the tables together, and load all
the fields in one query. This can work in general (Flexinode does
this), but has a few drawbacks. First, the JOIN operation is slow
(though faster than doing the queries separately). Second, there is a
hard limit of 31 joins in a query for MySQL, so this prevents complex
content types from being constructed. Some users have hit this limit
when using Flexinode. Third, the multiple-entry field would have to be
serialized for this to work, which has a significant impact on its
flexibility when performing queries.
1x query to fetch node IDs for the ten nodes (10 results)
10x queries to fetch each node, each with 11 joins (1 result each, + deserialize)
The joins can't be folded into the original query, because each node
may be of a different content type and so need a different set of
tables.
A simpler implementation is to load each field through a separate
query. This is a very flexible approach, and allows simple
compartmentalization of the query code into the field modules. The
obvious downside is that you need many queries for a simple node load.
1x query to fetch node IDs for the ten nodes (10 results)
100x queries to fetch each normal field (1 result each)
10x queries to fetch the multi-value fields (10 results each)
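To make the tallies so far easy to check, here is a small arithmetic
sketch in Python. It is not Drupal code; the constant and function
names are made up for illustration, and the numbers are just the
assumptions stated at the top of this post (10 nodes, 10 simple fields
and 1 multiple-entry field per node).

```python
# Workload assumed in the post: a listing of 10 nodes, each with
# 10 simple text fields and 1 multiple-entry field.
NODES = 10
SIMPLE_FIELDS = 10
MULTI_FIELDS = 1


def queries_joined(nodes):
    """One listing query, then one joined query per node."""
    return 1 + nodes


def queries_per_field(nodes, simple_fields, multi_fields):
    """One listing query, then one query per field per node."""
    return 1 + nodes * simple_fields + nodes * multi_fields


joined = queries_joined(NODES)
per_field = queries_per_field(NODES, SIMPLE_FIELDS, MULTI_FIELDS)

print("joined approach:   ", joined, "queries")
print("per-field approach:", per_field, "queries")
```

The per-field approach costs ten times as many round trips as the
joined one for the same listing, which is the gap caching is meant to
close.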
Next, we can add caching to this, the way Gerhard proposed. This means
that in node_load(), a query is performed to the cache table to load
the serialized node data.
1x query to fetch node IDs for the ten nodes (10 results)
10x queries to fetch each node (1 result each, + deserialize)
I think we can do a bit better, though. By placing the node cache into
a column in the node table, we can grab it at the same time as the node
IDs. Then node_load() would look to see if the cache is present in the
array presented to it, and if so use that rather than perform any
queries at all.
1x query to fetch node IDs and cache for the ten nodes (10 results + deserialize)
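A rough sketch of the cache-column idea, in Python with SQLite rather
than PHP and MySQL so that it runs anywhere. Everything here is an
assumption for illustration: the table layout, the "cache" column
name, JSON standing in for serialize(), and the load_fields() fallback
standing in for the per-field queries. The node_load() below only
mirrors the idea described above and is not the real Drupal function.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT, cache TEXT)"
)
# Ten nodes whose serialized form is already stored in the cache column.
for nid in range(1, 11):
    node = {"nid": nid, "title": f"Node {nid}", "fields": ["a", "b"]}
    conn.execute(
        "INSERT INTO node (nid, title, cache) VALUES (?, ?, ?)",
        (nid, node["title"], json.dumps(node)),
    )

queries = []  # every SQL statement issued through run()


def run(sql, params=()):
    """Execute a statement and record it, so queries can be counted."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def load_fields(nid):
    """Hypothetical slow path standing in for the per-field queries."""
    rows = run("SELECT title FROM node WHERE nid = ?", (nid,))
    return {"nid": nid, "title": rows[0][0], "fields": []}


def node_load(row):
    """Use the cache handed in with the row; query only on a miss."""
    nid, cache = row
    if cache is not None:
        return json.loads(cache)
    node = load_fields(nid)
    run("UPDATE node SET cache = ? WHERE nid = ?", (json.dumps(node), nid))
    return node


# The listing query returns the IDs and the cached data together.
rows = run("SELECT nid, cache FROM node ORDER BY nid LIMIT 10")
nodes = [node_load(row) for row in rows]
print(len(nodes), "nodes loaded with", len(queries), "query")
```

With every cache entry present, node_load() never touches the
database, so the whole listing costs the single query in the tally
above. A node with an empty cache column falls back to the slow path
and refills its own cache.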
Note that the same method works with the cache kept in the cache table
as well, but that requires a JOIN with a CONCAT() in the ON clause,
which seems risky. It also relies on direct knowledge of the cache
table structure rather than going through the cache API.
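For the curious, the join in question would look roughly like the
sketch below. It uses Python with SQLite so it can be run as-is;
SQLite spells concatenation as ||, where MySQL would use
CONCAT('node:', n.nid). The 'node:' key prefix and the two-column
cache table are hypothetical and not taken from the actual cache
schema, which is exactly the kind of knowledge the query has to
hard-code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE cache (cid TEXT PRIMARY KEY, data TEXT)")

for nid in range(1, 11):
    conn.execute("INSERT INTO node VALUES (?, ?)", (nid, f"Node {nid}"))
    if nid != 10:  # leave node 10 uncached to show the miss case
        conn.execute(
            "INSERT INTO cache VALUES (?, ?)",
            (f"node:{nid}", f"serialized node {nid}"),
        )

# The cache key is rebuilt from the node ID inside the ON clause.
rows = conn.execute(
    "SELECT n.nid, c.data FROM node n "
    "LEFT JOIN cache c ON c.cid = 'node:' || n.nid "
    "ORDER BY n.nid LIMIT 10"
).fetchall()

for nid, data in rows:
    print(nid, data if data is not None else "(cache miss)")
```

The LEFT JOIN keeps uncached nodes in the result with a NULL in the
data column, so node_load() could still fall back to its normal
queries for those.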
Any other ideas?
--
Jonathan Chaffer
Applications Developer, structure:interactive
(616) 364-7423 http://www.structureinteractive.com/