I'd like to add that some tables have redundant indexes. If MySQL has a key covering (col_a, col_b), it functions perfectly well for queries using col_a only. A special col_a index is unnecessary. This problem occurs in about three to five places, at first glance. Barry Jaspan wrote:
Currently, Drupal defines primary keys for some of its tables but in practice does not really use or rely on them as primary keys. The Schema module, however, is going to use them. I will soon begin development of Migrate, a module for incremental data migration between Drupal systems, and only tables with primary keys will be able to participate.
To that end:
1. node's primary key is wrong.
Currently, the primary key of the node table is (nid, vid), and we have a unique index on (vid). I think this is incorrect. Semantically, if a node has nid 12 and vid 30, we do not think of it as "node 12/30", we think of it as "node 12." The path to access it is node/12, not node/12/30. More precisely, the nid is what uniquely identifies the node as an entity; therefore, it needs to be the primary key.
So, we can change the primary key to be (nid), add a unique index on (nid,vid), and leave the unique index on (vid) too. There will be no performance consequences or code changes outside system.install.
2. Other tables have no primary key.
The tables block, filters, flood, permission, and term_relation have no primary key. They should, and the Schema module will need them to if they are to participate in upcoming functionality.
Note that we do not have to change the code to *use* these primary keys, we just need to add them. Since we have now resolved that using auto-incrementing fields is allowed in Drupal, we do not even need to change module code to maintain the primary key.
These two changes will require a core schema update with patches in system.install. If there is consensus, I'll submit a patch.
Thanks,
Barry