Valery's Mlog

Mindlog of a Freak
December 9th, 2005 by Valery Dachev

MySQL Upgrade

How to migrate from MySQL 4.0 to MySQL 4.1/5.x ? Umm… it’s hard. The most noticeable changes are two:

  • the contents of the TIMESTAMP field is no longer “20051209215200” – it’s something like “2005-12-09 21:52:00”. The tricks of relying on this format don’t work anymore and the way of processing it should be changed;
  • before 4.1 appeared we had a default charset which didn’t make much sense. Now we have character sets and collations at database, table and field level.

As far as the side effects of the first change can be fixed easily, this is not that trivial in the second case. By default, MySQL 4.0 uses latin1 charset, but a different one can be specified with the default_charset option in my.cnf. There is no noticeable effect on the contents or the work of the client. Maybe on the collations only (I may be wrong here) and on data ordering repectively. After migration to 4.1 this charset makes much more sense, especially when changing the charset of a column. For example, when a cp1251 information was stored in a latin1 column. If I try to execute the query:

ALTER TABLE `table` CHANGE `field` `field` varchar(32) DEFAULT NULL CHARACTER SET cp1251;

in order to fix the charset of the field, its contents will be inproperly converted which, as a result, would lead to data loss. This coverted takes place every time when the character set is changed from one to another. However there’s a trick, which can be found at http://dev.mysql.com/doc/refman/4.1/en/charset-conversion.html and it is to change the column type to binary and then back to the initial one but with the desired character set. This way no convertion takes place:
ALTER TABLE `table` CHANGE `field` `field` binary(32);<br />
ALTER TABLE `table` CHANGE `field` `field` varchar(32) DEFAULT NULL CHARACTER SET cp1251;

At the bottom of the page there’s a PHP script written by Shimon Doodkin which passes all databases, tables and fields and makes the changes mentioned above. It was the piece of code I based MySQL Charsets on. This tool does the same job, but it additionally gives you the ability to limit its work on a single database or table, to visually choose a charset and/or collation, to generate a script for conversion and some more features. It works very well for me and I’d be glad if someone else take advantige over it :)

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: