In the discussion of whether everything should be a node, there are two recurring objections. I believe both can be addressed. The 3 numbered paragraphs below describe a data model that responds to the objections. Then, the paragraphs following look at additional implications. First objection is that database performance would be a nightmare if everything were stored in one table. I agree. However, it is not necessary to keep everything in one table. The second objection is stated in different ways, so I'll synthesize: data nodes, taxonomy vocabularies and terms, and other things like comments and users, are all different _kinds_of_things_. It doesn't make sense to treat them the same. I respond by saying that they *are* the same thing: they are data structures with most fields and operations in common. The fact that they don't _mean_ the same thing to us meatpeople isn't relevant to the question of storage and low-level API's, only to the implementation of each type's handler functions. Here's what I have in mind. 1. Put each node type in its own table, and name the table for the type, e.g. {node_footype}. Each node-type table will have columns for the fields that are absolutely universal to all nodes, at a minimum the globally unique node ID, plus timestamps, etc. Note that the node type is not stored in a column in this table, as it is invariant across rows; the typename is implicit in the table name. (In memory, of course, a node's type identifier would need to be recorded.) 2. Each "node type component" that bundles its own fields and defines its own handlers will also have its own table. For example, Case Tracker and Category implement fields and behavior that can be attached to any node type. This one will only have the fields that pertain to the add-on functionality. Again, the type is not recorded in the table data; it is implicit in the table's name, e.g. {typecomp_casetracker}. 3. The fields that are added to the node type's definition individually would be an implicit node type component, so we'll put those in {typecomp_footype}, corresponding to {nodetype_footype}. For example, the fields we would add with CCK, or in a module that defines a complete and independent node type. [Note this implies that node types and type components share a namespace. I don't believe that's a problem. It makes sense for a node type component to have dibs on the base node type of the same name. (If a node type name is already taken, you need to choose a different name for your module anyway.)] So, the definition of a node type is recorded as a type-name and a list of the associated add-ons, the type components. A type component is defined with a name and a list of field names and types, presumably from a catalog of field type definitions recorded elsewhere. For any node type, regardless of the meaning of its data, whether those are content or categories or whatever, the database operations are the same. For example, I want to retrieve all of one node's fields. I the programmer don't even necessarily know what they all are, and I don't need to. (Please excuse my sloppy syntax.) $querytext = 'select * from node_'. $nodetype. ' '; $jointype = 'inner'; foreach( list_type_components($nodetype) as $comp) { $querytext .= "$jointype join typecomp_$comp on node_$nodetype.node_id = typecomp_$comp.node_id". ' '; } $querytext .= 'where blah blah whatever' If the fieldnames are encoded from the type component's name, e.g. my_component_name.text_field1, the developer will never need to know or care about the storage mechanism, as all he'll ever give or get is an associative array. Each type component's handler would add/remove/whatever the node's ID and type to an index table owned by that type component. This takes care of the case where we want to manipulate a node or nodes from the database without knowing their type, and that's when we want to work with all of one metatype (taxo term, content, etc.). For each metatype, create a node type component that's exclusive to it, and *POOF!*. All the semantical differences among node types take care of themselves in the naming conventions, and where necessary in the handler functions. I believe this mechanism might allow us to avoid the major database performance issues, particularly the self-join problem. It should also simplify the engine by merging all our current data-metatypes into only one basic abstract type with one API, without introducing confusion for the module-level developers. -Edgar