[development] Let's finally document our schema already!

Barry Jaspan barry at jaspan.org
Thu Aug 9 13:50:46 UTC 2007


Angie,

Glad to see someone else jumping on this idea!  See the issue
http://drupal.org/node/160599 on this exact topic which I posted three
weeks ago, and announced here.  Clearly, your posts get more respect
than mine do. :-) I'll mark my issue as a duplicate.

> Earl came up with this slick idea in the Views 2.0 schema to stick a
> 'description' attribute on each field in the schema file. 

That was me, and it is in the Schema API docs (see
http://drupal.org/node/146939, but I'm glad to see Earl using it. :-)
I am hoping to see Views 2.0 and Schema work closely together for much
synergistic goodness.

> I've taken this idea a couple steps further (by adding table
> descriptions as well as field descriptions, and adding in foreign key
> information), and have started documenting all core schema files here:
> http://drupal.org/node/164983

Again, see http://drupal.org/node/146939.  Be sure to wrap table names
in table and field descriptions in curly braces so that schema.module
can display them as hyperlinks to the table's definition.

> - Schema API folks, to give guidance, esp. on foreign key
> representation.

My guidance is not to include foreign key information as part of this
issue as it will just complicate it and delay getting the description
fields in.

I worked on ref integrity a couple months ago, defined a way to
represent foreign keys in the schema structure, and wrote the code in
the pgsql driver to create the foreign keys.  It was actually quite
easy to do.  However, it turns out that Drupal will need substantial
and careful work before referential integrity can be added including
changes all over core and contrib; for example, see my previous thread
on this list about whether to use RESTRICT or CASCADE deletions and
the impliciations for the rest of system.  Since we cannot safely
*use* foreign key info, I'm not sure it is a good idea to include it
in the schema.

We could include foreign key info "as documentation" but we'd want to
be sure we got the representation right, and it is tricky.  First, the
foreign keys array must be keyed by table name, not field name, to
reasonably support multi-column foreign keys.  Here is the format I
came up with:

$schema['node'] = array(
  'fields' => array(
     'nid' => array('type' => 'int'),
     ...
  ),
);
$schema['node_revisions'] = array(
  'fields' => array(
     'nid' => array('type' => 'int'),
     'vid' => array('type' => 'int'),
     ...
  ),
  'foreign keys => array(
    'node' => array('nid' => 'nid'),
  ),
));

$schema['content_type_foobar'] = array(
   'fields' => array(
     'nid' => array('type' => 'int'),
     'vid' => array('type' => 'int'),
     ...
   ),
   'foreign keys' => array(

    // redundant, just shows you can have more than one:
    'node' => array('nid' => 'nid'),

    // could just be on vid, but shows you can have multiple columns:
    'node_revisions' => array('nid' => 'nid', 'vid' => 'vid'),
   ),
));

As shorthands, when the column name is the same in both tables you can
just specify it as value, not as key => value, and when there is
exactly one column in the foreign key and its name is the same in both
tables, you can specify it as a string instead of an array.  So,
content_type_foobar could equivalently contain:

   'foreign keys' => array(
     'node' => 'nid',
     'node_revisions => array('nid', 'vid'),
   )

HOWEVER, there is more to it than this.  For example, we need to be
able to specify JOINS in the schema that are not representable as
foreign keys.  For example, we have tables that contain an "object id"
(int) and a "object type" (string, e.g. "node").  The join is "object
id = node.nid WHERE object type = 'node'".  That can't be a foreign
key because it is conditional, but we should still represent the
relationship in the schema so we can automatically load nodes
including this table's information.  I haven't thought about the best
way to do that yet.

Bottom line, don't hold up schema descriptions until we figure out how
to represent foreign keys/joins.

Thanks,

Barry



More information about the development mailing list