On Mon, 2007-01-22 at 20:48 -0600, Larry Garfield wrote:
And right there you lose the ability to treat a node as "just a node", because the node data is so spread out. You can't easily "find all nodes created after time X", because there's n different tables you have to search.
I think I would tend to disagree. Yes, you need to load from the database a definition of each node type. However, you only have to do that once per node type, not once per transaction. Once you have that type definition in memory, you can build a single query for your node operation, regardless of how many tables the node data are spread out among. I'll offer a refinement of my previous example for illustration:

    function retrieve_newer($nodetype, $cutoff_datetime, $comparison='>=') {
      $querytext = 'SELECT * FROM node_'. $nodetype. ' ';
      $jointype = 'INNER';
      foreach (list_type_components($nodetype) as $comp) {
        $querytext .= "$jointype JOIN typecomp_$comp ON node_$nodetype.node_id = typecomp_$comp.node_id". ' ';
      }
      /* etc. */
      return $resultset;
    }

The developer/user never needs to know about the table structure. All he needs to know is the type identifier, or even just a type component identifier. For SELECTing node data, he gets all the fields together, and when he refers to $mynode[component_name][field_name] he'll get the fields he wants. The only way he wouldn't is if he asked for the wrong node type, and that's hard to do in this example. (As I mentioned previously, you would never even want a node's data without having at least partial knowledge of its type.)

So, we'll have *at most* one query per node operation--less with lists of nodes--plus one query per node type per bootstrap. (We may be able to eliminate the latter. I'm about to post a proposed module loading scheme that could reduce or eliminate those.)

Now, this does imply a separate query for each node type in any list of nodes. I think maybe we could work to improve even that with the right join setup, but even if I'm wrong, one query per node type ain't so bad.
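To make the retrieve_newer() idea a bit more concrete, here is a rough sketch of just the query-building half. It is not a finished implementation. Everything beyond the node_<type> / typecomp_<component> naming above is an assumption made for illustration: the hard-coded registry standing in for list_type_components(), the `created` column on the base table, the build_newer_query() name, and the `%d` placeholder (written in the style of Drupal's db_query()).

```php
<?php
// Hypothetical stand-in for the type registry. In the scheme described,
// this would be loaded from the database once per node type.
function list_type_components($nodetype) {
  $registry = array(
    'story' => array('body', 'taxonomy'),
    'case'  => array('body', 'status'),
  );
  return isset($registry[$nodetype]) ? $registry[$nodetype] : array();
}

// Build a single SELECT that joins every component table of one node type.
// Returns the query text, or null if the type identifier looks unsafe.
function build_newer_query($nodetype, $comparison = '>=') {
  // Table names cannot be bound as parameters, so whitelist the identifier.
  if (!preg_match('/^[a-z0-9_]+\z/', $nodetype)) {
    return null;
  }
  // Only accept known comparison operators; fall back to the default.
  $allowed = array('>', '>=', '<', '<=', '=');
  if (!in_array($comparison, $allowed, true)) {
    $comparison = '>=';
  }

  $base = 'node_' . $nodetype;
  $sql = "SELECT * FROM $base";
  foreach (list_type_components($nodetype) as $comp) {
    $table = 'typecomp_' . $comp;
    $sql .= " INNER JOIN $table ON $base.node_id = $table.node_id";
  }
  // Assumed column name; the cutoff is left as a placeholder for the caller.
  $sql .= " WHERE $base.created $comparison %d";
  return $sql;
}
```

Calling build_newer_query('story') yields one statement with two INNER JOINs, no matter how many component tables the type has. The caller still only passes a type identifier and never sees the table layout.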
And of course, I'm seeing a trend toward more CCK-esque nodes, which means fields get split out into separate tables to allow for richer data types and multi-value fields. That complicates things in an entirely different way.
Again, I believe I may not agree. Assume we require that *all* node fields be bundled up in the named groups that I called "node type components", in the manner of modules like Case Tracker or Category. Assume also that each type-component has its own table, with the relationships I described before. Then, simply by knowing either the node's type identifier _or_ the identifier of any of the node's type components, we have full access to all of a node's data, *even if* we don't have full information about its type beforehand. _Complexity_ becomes much less of an issue at the module-development level.

There is the performance load of table joins to consider, but I'm not worried about it much. For one, we have that now. For another, perhaps that will be offset by the reduced number of queries involved in a node transaction.

-Edgar