[drupal-devel] Crucial problem with mysql 4.1.x and collation

Khalid B kb at 2bits.com
Wed Sep 7 00:35:29 UTC 2005


I did face something similar, but not quite the same. My hosting account
runs MySQL 4.0.25, while my development server runs 4.1.11.

When doing a dump from development and trying to load it on the
hosting account, I got an error that the phrase DEFAULT CHARSET is not
valid.

So, I resorted to doing:

gunzip -c dbdump.sql.gz | sed -e 's/DEFAULT CHARSET=latin1//' | mysql -uusername -ppassword dbname

This strips that clause from the dump as it is loaded.
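
A slightly more general form of the same filter, as a sketch only: it
assumes the clause appears the way mysqldump 4.1 writes it at the end of a
CREATE TABLE statement, and strip_charset is a name made up for this
example, not part of any MySQL tool. It would also rewrite any row data
that happens to contain the same text, and it does not touch column-level
character set clauses.

```shell
#!/bin/sh
# strip_charset: hypothetical helper that removes the table-level
# charset/collation clauses a 4.0 server does not understand.
strip_charset() {
  sed -e 's/ DEFAULT CHARSET=[A-Za-z0-9_]*//' \
      -e 's/ COLLATE=[A-Za-z0-9_]*//'
}

# Sample line in the shape found at the end of a CREATE TABLE statement.
printf '%s\n' ') ENGINE=MyISAM DEFAULT CHARSET=latin1;' | strip_charset
# prints: ) ENGINE=MyISAM;
```

It would be used in place of the sed call above, for example
gunzip -c dbdump.sql.gz | strip_charset | mysql -uusername -p dbname.
It may also be worth checking whether mysqldump's --compatible=mysql40
option on the 4.1 side makes the filter unnecessary.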

I did not notice a problem with accented characters though. Maybe I
should check more closely.
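
For anyone who wants to check, here is a rough sketch of one plausible way
such garbage arises: UTF-8 bytes stored in a latin1 column get treated as
latin1 (which in MySQL is, as far as I know, essentially cp1252) and are
converted to UTF-8 a second time. The sample string and variable names are
made up for illustration, and this is not a claim about what happened to
any particular database.

```shell
#!/bin/sh
# Sample text containing U+2019 (right single quote) as UTF-8 octal bytes.
original=$(printf 'Saturday\342\200\231s Teen People')

# Misread the UTF-8 bytes as CP1252 and re-encode them as UTF-8.
# The one character becomes three, which is what shows up as garbage.
garbled=$(printf '%s' "$original" | iconv -f CP1252 -t UTF-8)

# Reverse the step. This only works if no bytes were lost along the way.
repaired=$(printf '%s' "$garbled" | iconv -f UTF-8 -t CP1252)

printf '%s\n%s\n' "$garbled" "$repaired"
```

If a dump shows the same kind of three-character sequences where a single
accented character or quote should be, the reverse conversion on a copy of
the dump may be worth trying before editing entries by hand.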

On 9/6/05, Abalieno <abalieno at cesspit.net> wrote:
> This is a rather serious problem, from my point of view, that I foresee
> getting rather widespread if not solved quickly. After a few hours of
> research and a headache here's what I discovered:
> 
> - Mysql 4.1.x adds the possibility to set a collation for the database. This
> seems a new feature that wasn't there before.
> 
> - By default it seems that every database created or imported is
> automatically set to "latin1_swedish_ci". The whole database is set with
> that collation as you install drupal under that version of mysql or import a
> previous dump.
> 
> - This is causing a serious corruption in the database while exporting it
> because accented and other utf-8 characters are just NOT COMPATIBLE with the
> latin1_swedish_ci set.
> 
> - This means that if you install drupal on mysql 4.1.x, the very first time
> you export the database for a backup or whatever else, you'll finish with a
> corrupted dump because all the accented characters in the nodes, comments
> and aggregator items will get replaced with GARBAGE. As ->
> "Saturday’s Teen People" in the place of "Saturday 's Teen
> People" This is taken from my now broken database and since I noticed this
> too late I now cannot do anything if not manually change every single entry.
> How fun.
> 
> - In the handbook, install.txt and all the other install guides for Drupal
> THERE IS NO MENTION of the collation. This means that it's written nowhere
> how to set the collation and so everyone just follows the standard
> instructions and finishes with a "latin1_swedish_ci" as it happened to me.
> Including the text in the aggregator items, node and comment bodies.
> 
> - How the hell must the DB be configured now? Because from what I read here:
> http://drupal.org/node/15746#comment-36443 It's not even possible to set
> Drupal to use utf8 because it's still not compliant.
> 
> So, besides having my database now unrecoverable, how should I set it to have
> it working properly from now on and be able to back it up without getting
> unrecoverable garbage text?
> 
> Then I seriously suggest you patch the guides and the drupal package to
> stop this, or it will become a rather large problem, considering that
> following the instructions step by step leads unavoidably to this
> corruption problem.
> 
> - HRose / Abalieno
> 
>

