Reconciling mysql & pgsql schemas
The Schema core patch (http://drupal.org/node/136171) to provide a base level of database schema abstraction is coming along nicely. This will include defining a table definition structured array so that manual CREATE TABLE statements will be history as well as the suite of db_{add,remove}_{field,key,index} so that ALTER TABLE statements will also be history. Existing hook_update_n() functions and the entire existing "schema version" system will be unaffected. One issue that has come to light is that our existing mysql and pgsql database schemas, as represented in the CREATE TABLE statements, are different. The differences are pretty minor: signed vs unsigned, default vs no default, int vs smallint. However, since the goal is to create a single data structure representing each module's tables, I think we should resolve the differences. I've listed all of the discrepancies below. This list was derived by: 1. Installing HEAD on mysql (with current CREATE TABLE statements) with all optional modules enabled. 2. Automatically generating the database schema structure by inspecting the mysql database and SAVING it in each module's hook_schema. 3. Installing HEAD on pgsql (with current CREATE TABLE statements) with all optional modules enabled. 4. Automatically generating the database schema structure by inspecting the pgsql database and COMPARING it (NOT saving it) against each module's hook_schema. As an example, the first mismatched table is aggregator_feed. The comparison report translates as: "Table aggregator_feed has a column named 'description'. hook_schema for this table says the column has no default value but the actual database column has a default value of ''." Sure enough, if you look at the mysql and pgsql CREATE TABLE statements, you will see that mysql's (which is what was used to create the "declared" structure) has no default while pgsql's (which is what was used to create the "actual" structure) has one. Fixing this is a two-step process. First, we must decide which schema (mysql's or pgsql's) is right in each case. Then, either: 1. write a system_update function to modify each mismatching table into conformance, or 2. put db-specific values into the schema data structure, like this: $schema['aggregator_feed']['fields']['description'] = array( 'type' => 'text', 'not null' => 1, 'pgsql_default' => ''); Note the 'pgsql_default' attribute. We can also use a combination of 1 and 2 if necessary. I propose using 1 whenever possible and 2 only if/when necessary. Comments? The comparison report (compliments of schema.module) follows. Thanks, Barry Tables for which the schema and database are different. aggregator * aggregator_feed o column description: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') o column image: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') o unique keys link: missing in database o unique keys url: unexpected (not an error) block * blocks o column pages: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') * boxes o column bid: declared: array('type' => 'int', 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o column bid: declared: array('type' => 'int', 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o column format: declared: array('type' => 'int', 'not null' => 1, 'default' => 0) actual: array('type' => 'int', 'size' => 'small', 'not null' => 1, 'default' => 0) comment * comments o column format: declared: array('type' => 'int', 'not null' => 1, 'default' => 0) actual: array('type' => 'int', 'size' => 'small', 'not null' => 1, 'default' => 0) o indexes lid: missing in database o indexes nid: unexpected (not an error) * node_comment_statistics o column nid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) contact * contact o column cid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o column recipients: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') o column reply: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') dblog * watchdog o column location: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') drupal * client o column cid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) locale * locales_source o column source: declared: array('type' => 'blob', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1) o column source: declared: array('type' => 'blob', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1) o indexes source: unexpected (not an error) * locales_target o column translation: declared: array('type' => 'blob', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1) o column translation: declared: array('type' => 'blob', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1) o indexes lang: missing in database o indexes language: unexpected (not an error) node * node o column nid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o indexes node_title_type: declared: array('title', 'type') actual: array('title') o indexes node_type: missing in database * node_revisions o column vid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o column vid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o column vid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o column body: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') o column teaser: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') o column log: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '') poll * poll_choices o column chid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) search * search_index o column score: declared: array('type' => 'float', 'not null' => ) actual: array('type' => 'float', 'size' => 'big', 'not null' => ) * search_total o column count: declared: array('type' => 'float', 'not null' => ) actual: array('type' => 'float', 'size' => 'big', 'not null' => ) system * authmap o column aid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) * cache o column serialized: declared: array('type' => 'int', 'not null' => 1, 'default' => 0) actual: array('type' => 'int', 'size' => 'small', 'not null' => 1, 'default' => 0) * cache_filter o column serialized: declared: array('type' => 'int', 'not null' => 1, 'default' => 0) actual: array('type' => 'int', 'size' => 'small', 'not null' => 1, 'default' => 0) * cache_page o column serialized: declared: array('type' => 'int', 'not null' => 1, 'default' => 0) actual: array('type' => 'int', 'size' => 'small', 'not null' => 1, 'default' => 0) * files o column fid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) o column fid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) o column fid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) o column fid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) * menu o indexes pid: declared: array('pid') actual: array('parent') * url_alias o column pid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) o unique keys dst_language: missing in database taxonomy * term_data o column tid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) * term_node o primary key: declared: array('vid', 'tid', 'nid') actual: array('tid', 'nid', 'vid') * term_synonym o indexes name: missing in database * vocabulary o column vid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) user * role o column rid: declared: array('type' => 'serial', 'unsigned' => 1, 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) * users o column uid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) o column uid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) o column uid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1) o column uid: declared: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0) actual: array('type' => 'serial', 'not null' => 1)
Hi Barry, Cool stuff. Just a minor random comment: On 14 May 2007 14:23:43 -0400 Barry Jaspan <barry@jaspan.org> wrote:
* aggregator_feed o column description: declared: array('type' => 'text', 'not null' => 1) actual: array('type' => 'text', 'not null' => 1, 'default' => '')
As for MySQL, this is a bug that was fixed in 5.0.25. By definition, TEXT fields can't have default values: "TEXT and BLOB columns do not support DEFAULT values. However, when a default of '' was specified, the specification was silently ignored. This now results in a warning, or an error in strict mode." Full details here: http://bugs.mysql.com/bug.php?id=19498 The actual source patch: http://lists.mysql.com/commits/9310 And a request to restore the pre-5.0.25 "feature" of allowing DEFAULT '' on text fields that has not been accepted: http://bugs.mysql.com/bug.php?id=21532 Cheers, -Jeremy
Barry, if I understand the patch right, things like the following are not actual mismatches, are they ? * boxes o column bid: declared: array('type' => 'int', 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1) Y. Ch.
if I understand the patch right, things like the following are not actual mismatches, are they ?
* boxes o column bid: declared: array('type' => 'int', 'not null' => 1) actual: array('type' => 'serial', 'not null' => 1)
Yes, they are. The former is the result of $schema['boxes']['fields']['bid'] = array('type' => 'int', ...); and the latter is the result of $schema['boxes']['fields']['bid'] = array('type' => 'serial', ...); The latter translates to "int auto_increment" on mysql and "serial" on pgsql. The data type in the column is the same as if the type were "int" but the behavior is different. Most importantly, the code that manipulates the 'boxes' table either treats the bid field as auto-incrementing or it does not, and the field definition and the code should agree. In this case, since the system works with type 'int' on mysql, clearly the code is not treating the field as auto-incrementing. Actually, what is going on here is that pgsql's db_next_id() assumes that a sequence object named <table>_<field>_seq exists. So we aren't using the auto-incrementing nature of the column but we are using the fact that creating a serial column automatically creates a sequence with a well-known name. The easy solution to this particular class of mismatches is to declare the columns to be type 'serial'. 'serial' then means both (a) the column is auto-incrementing and (b) you are allowed to use db_next_id() on it. As per a previous message I posted here, though, you must ALWAYS do one or the other, never both. Barry
On 14 May 2007, at 8:23 PM, Barry Jaspan wrote:
As an example, the first mismatched table is aggregator_feed. The comparison report translates as: "Table aggregator_feed has a column named 'description'. hook_schema for this table says the column has no default value but the actual database column has a default value of ''." Sure enough, if you look at the mysql and pgsql CREATE TABLE statements, you will see that mysql's (which is what was used to create the "declared" structure) has no default while pgsql's (which is what was used to create the "actual" structure) has one.
It's late, and my memory is a bit spotty right now, but iirc If you don't specify a default value in postgres, if you make a query , it will cause the query to fail. The default is basically like a 'required'.
adrian rossouw <adrian@bryght.com> writes:
It's late, and my memory is a bit spotty right now, but iirc If you don't specify a default value in postgres, if you make a query , it will cause the query to fail.
The default is basically like a 'required'.
I'm not sure if you are saying (a) that specifying a default value in CREATE TABLE is required or (b) that if you specify a default value in CREATE TABLE then all subsequent INSERTs into that table will require that you specify the default value or (c) something else. When you wake up, please clarify. :-) Barry
On 15 May 2007, at 1:59 PM, Barry Jaspan wrote:
When you wake up, please clarify. :-)
Please ignore what I said. I only faintly recall getting up sometime in the early morning and replying to mail. I need to go find out what I sent now ... It's still not as weird as the time I woke up from a lucid dream state realizing that I was trying to speak japanese in my sleep (Note: i don't know japanese)
participants (4)
-
adrian rossouw -
Barry Jaspan -
Jeremy Andrews -
Yves Chedemois