Hi all.

Algorithmic question. Actually, two questions that I feel have related or complementary solutions, so I am tossing them out together. I have two scenarios I am looking at where I need to be able to pull data in from an external source and make a node out of it, *and* be able to fully update the entire dataset at any time from the external source.

Scenario 1: Course listings for a school. Courses come in as a CSV file. The CSV includes duplicate data for courses and sections of courses, as well as fields like the instructor, which is an Instructor node type elsewhere in the system. The instructor is not linked by node ID, however, but by name, so a lookup needs to be done on import. Imports typically happen only a few times a year, but at those times they will easily be run a dozen times in a few days. Each import is several hundred records.

Scenario 2: A DocBook source tree, such as the one sepeck is building for the new Drupal.org documentation. A given book needs to be made available via Drupal pages, with the structure of the pages on the site matching up with the DocBook tree. Whether pages break at chapters/articles, sections, sub-sections, etc. should be admin controlled. Each book could potentially be hundreds of pages (for some admin-defined definition of page).

In neither scenario does the incoming data have any knowledge of "nodes". Also, in neither scenario do I need round-tripping, so if the data is not editable via Drupal, or edits are always overwritten by a new import, that is perfectly fine.

Now, the non-node solution to both of these is reasonably straightforward. In the first scenario, simply read course data into two separate tables (one for courses, one for sections) with the appropriate foreign keys, then set up custom menu callbacks for courses and sections and lists thereof that display the desired information. When a new file is imported, flush those tables and rebuild. You probably wouldn't even use auto-increment IDs for them.
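
To make the first case concrete, here is a rough sketch of that flush-and-rebuild import. It is written in Python with SQLite purely for illustration rather than as Drupal code, and the CSV column names (course_code, course_title, section_no, instructor) and the table layout are assumptions, not the real export format. The instructor-name-to-node lookup is left out; the name is simply stored as text.

```python
import csv
import io
import sqlite3

# Hypothetical schema: natural keys from the CSV, no auto-increment IDs.
SCHEMA = """
CREATE TABLE IF NOT EXISTS course (
    code  TEXT PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS section (
    course_code TEXT NOT NULL REFERENCES course(code),
    section_no  TEXT NOT NULL,
    instructor  TEXT,
    PRIMARY KEY (course_code, section_no)
);
"""


def rebuild(conn, csv_text):
    """Flush both tables and reload them from one CSV export.

    Returns (number of courses, number of sections) after the import.
    """
    # Parse everything first so a malformed file fails before any delete.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executescript(SCHEMA)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM section")
        conn.execute("DELETE FROM course")
        for row in rows:
            # Course data repeats on every section row; keep the first copy.
            conn.execute(
                "INSERT OR IGNORE INTO course (code, title) VALUES (?, ?)",
                (row["course_code"], row["course_title"]),
            )
            conn.execute(
                "INSERT OR REPLACE INTO section "
                "(course_code, section_no, instructor) VALUES (?, ?, ?)",
                (row["course_code"], row["section_no"], row["instructor"]),
            )
    n_courses = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
    n_sections = conn.execute("SELECT COUNT(*) FROM section").fetchone()[0]
    return n_courses, n_sections
```

Calling rebuild(sqlite3.connect(":memory:"), csv_text) a second time with a new file simply replaces everything, which is the whole point of this approach and also why nothing attached to the old rows would survive.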
In the second case, have a single menu callback that corresponds to the root of the DocBook tree. Arguments to that callback map to the structure of the tree. Scan the tree once and build a menu based on the outline, then lazy-load page data and cache it in rendered (via XSLT or whatever) form for later display. If the tree structure changes or the admin changes the page-break settings, flush the cache and rebuild the spine menu.

The problem with both of those methods is that in neither case is the data a node. That means you do not get any of the benefits of data being nodes (comments, CCK capability, nodereference, a node/$nid "permalink" that doesn't change when the book structure is refactored, Views support, etc.).

On the other hand, both cases need flush/rebuild ability. That means either creating and destroying nodes by the hundreds on a regular basis -- which would be a very slow operation and would result in the loss of any of that additional metadata -- or building some additional mechanism for tracking what part of the original source data maps to what once-created node. The former is quite undesirable, while the latter is potentially quite complex (especially when, of course, you don't have absolute control over the incoming data and so can't guarantee that it has a unique ID you can reference).

Since I can think of at least two places I would want to use each of those (Drupal.org being one use case for scenario 2), both seem like natural cases for generalized modules. I want to solve the import/sync problem first, however.

I also do not believe that the importexportapi module would be useful here. I looked into it last year for a similar task, and from the documentation determined that it was either too over-engineered or too under-documented for my uses. By the time I figured out how to use it, I could probably just have written it myself. :-/

The DocBook scenario I'm assuming is a Drupal 6-based module only, and PHP 5-only.
The courses scenario may be Drupal 5 or Drupal 6, depending on scheduling.

So, I toss the brain-teaser out there: Is there a good way to have my nodes and import them too, or are these cases where nodes are simply the wrong tool and the direct-import-and-cache mechanisms described above are the optimal solutions?

--
Larry Garfield			AIM: LOLG42
larry@garfieldtech.com		ICQ: 6817012

"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson