Normally, the server runs using the latin1
character set by default. If you have been storing column data
that actually is in some other character set that the 4.1
server supports directly, you can convert the column. However,
you should avoid trying to convert directly from
latin1
to the "real" character set. This
may result in data loss. Instead, convert the column to a
binary data type, and then from the binary type to a nonbinary
type with the desired character set. Conversion to and from
binary involves no attempt at character value conversion and
preserves your data intact. Suppose that you have a 4.0 table
with three columns that are used to store values represented
in latin1
, latin2
, and
utf8
:
CREATE TABLE t ( latin1_col CHAR(50), latin2_col CHAR(100), utf8_col CHAR(150) );
For MySQL 4.1, you want to convert this table to leave
latin1_col
alone but change the
latin2_col
and utf8_col
columns to have character sets of latin2
and utf8
. Before upgrading to 4.1, back up
your table, then convert the columns as follows:
ALTER TABLE t MODIFY latin2_col BLOB; ALTER TABLE t MODIFY utf8_col BLOB;
Then, after upgrading to 4.1, complete the conversion by issuing these statements:
ALTER TABLE t MODIFY latin2_col CHAR(100) CHARACTER SET latin2; ALTER TABLE t MODIFY utf8_col CHAR(150) CHARACTER SET utf8;
The first two statements “remove” the character
set information from the latin2_col
and
utf8_col
columns. The second two statements
assign the proper character sets to the two columns.
If you like, you can combine the to-binary conversions and from-binary conversions into single statements. In MySQL 4.0, do this:
ALTER TABLE t MODIFY latin2_col BLOB, MODIFY utf8_col BLOB;
After upgrading to 4.1, do this:
ALTER TABLE t MODIFY latin2_col CHAR(100) CHARACTER SET latin2, MODIFY utf8_col CHAR(150) CHARACTER SET utf8;
If you can ensure that the tables will not otherwise be
modified before you perform the character set conversion, you
can issue all of the ALTER
TABLE
statements after upgrading to MySQL 4.1.
If you specified attributes when creating a column initially,
you should also specify them when altering the table with
ALTER TABLE
. For example, if
you specified NOT NULL
and an explicit
DEFAULT
value, you should also provide them
in the ALTER TABLE
statement.
Otherwise, the resulting column definition will not include
those attributes.
User Comments
<?php
/* $Id: mysqlupgrade.php,v 1.3 2005/01/31 22:04:02 shimon Exp $ */
// upgrade CHARACTER SET for MySQL 4.1.0 +
//
// Did you export all databases including mysql database before runing this file ?
//
// known bug of this program it dont know to treat FULLTEXT index
//
//by Shimon Doodkin shimon_d@hotmail.com
$conn = mysql_connect("localhost", "mashovim.co.il", "***");
$printonly=true; //change this to false to alter on the fly
$charset="hebrew";
$collate="hebrew_general_ci";
$altertablecharset=true;
$alterdatabasecharser=true;
function PMA_getDbCollation($db)
{
$sq='SHOW CREATE DATABASE `'.$db.'`;';
$res = mysql_query($sq);
if(!$res) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
if($row = mysql_fetch_assoc($res))
{
$tokenized = explode(' ', $row[1]);
unset($row, $res, $sql_query);
for ($i = 1; $i + 3 < count($tokenized); $i++)
{
if ($tokenized[$i] == 'DEFAULT' && $tokenized[$i + 1] == 'CHARACTER' && $tokenized[$i + 2] == 'SET')
{
if (isset($tokenized[$i + 5]) && $tokenized[$i + 4] == 'COLLATE')
{
return array($tokenized [$i + 3],$tokenized[$i + 5]); // We found the collation!
}
else
{
return array($tokenized [$i + 3]);
}
}
}
}
return '';
}
?>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1251<? //remember to change it if needed ?>" />
<xmp>
<?
$rs2 = mysql_query("SHOW DATABASES");
if(!$rs2) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
while ($data2 = mysql_fetch_row($rs2))
{
$db=$data2[0];
$db_cha=PMA_getDbCollation($db);
if ( $db!='mysql' /* &&( $db=='mydb_x' || $db=='mydb_y' ) */ ) // limit to database(s) - $db=='mydb_x' || $db=='mydb_y' || $db=='mydb_z'
if ( substr($db_cha[0],0,4)!='utf8' ) // limit to charset
{
mysql_select_db($db);
$rs = mysql_query("SHOW TABLES");
if(!$rs) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
while ($data = mysql_fetch_row($rs))
{
$rs1 = mysql_query("show FULL columns from $data[0]");
if(!$rs1) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
while ($data1 = mysql_fetch_assoc($rs1))
{
if(in_array(array_shift(split("\\(",$data1['Type'],2)),array(
//'national char',
//'nchar',
//'national varchar',
//'nvarchar',
'char',
'varchar',
'tinytext',
'text',
'mediumtext',
'longtext',
'enum',
'set'
)))
{
if(substr($data1['Collation'],0,4)!='utf8') // limit to charset
{
$sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type'].' CHARACTER SET binary '.($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').';';
if(!$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
else
{
echo ($sq."\n") ;
$sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type']." CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate").($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').($data1['Comment']==''?'':' COMMENT \''.mysql_escape_string($data1['Comment']).'\'').';';
if(!$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
else echo ($sq."\n") ;
}
}
}
}
if($altertablecharset)
{
/*
$sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET binary";
echo ($sq."\n") ;
if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
*/
$sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
echo ($sq."\n") ;
if(!$printonly)
if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
}
}
if($alterdatabasecharser)
{
/*
$sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET binary";
echo ($sq."\n") ;
if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
*/
$sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
echo ($sq."\n") ;
if(!$printonly)
if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
}
}
}
?>
</xmp>
Is this MODIFY query correct? Only way that i was able to get proper sorting on utf8/utf8_polish_ci table is:
ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;
like said on page:
http://dev.mysql.com/doc/mysql/en/alter-table.html
I've a problem with this method, at least going from latin1_swedish_ci to utf8_general_ci, when switching back to varchar after changing the charset I receive errors on unique fields where it thinks Éleanore and Eleanore are the same (note the É ) Not sure if this is a bug (which it looks like) or if I've missed something that isn't covered with this method.
About the script posted by Shimon Doodkin.
The FULLTEXT limitation can be avoided with:
ALTER TABLE `file_section` DISABLE/ENABLE KEYS between every table changes.
This script also attempts to change collation of MySQL system databases which is not desired in most cases.
I guess, there is more elegant solution to convert specified databases to specified charset (including conversion of all table text fields to specified charset if that is what you want):
<?php
// Script written by Vladislav "FractalizeR" Rastrusny
// http://www.fractalizer.ru
//MySQL connection settings
$db_server = 'localhost';
$db_user="root";
$db_password="";
mysql_connect($db_server, $db_user, $db_password) or die(mysql_error());
//Put here a list of databases you need to change charset at or leave array empty to change all existing
$dblist=array();
//If changing at all databases, which databases to skip? information_schema is mysql system databse and no need to change charset on it.
$skip_db_list = array('information_schema', 'mysql');
//Which charset to convert to?
$charset="utf8";
//Which collation to convert to?
$collation="utf8_general_ci";
//Only print queries without execution?
$printonly=true;
//Getting database names if they are not specified
$skip_db_text = '"'.implode('", "', $skip_db_list).'"';
if(count($dblist)<1) {
$sql="SELECT GROUP_CONCAT(`SCHEMA_NAME` SEPARATOR ',') AS FRST FROM `information_schema`.`SCHEMATA` WHERE `SCHEMA_NAME` NOT IN ($skip_db_text)";
$result = mysql_query($sql) or die(mysql_error());
$data = mysql_fetch_assoc ($result);
$dblist=explode(",", $data["FRST"]);
}
//Iterating databases
foreach ($dblist as $dbname) {
$sql="SELECT CONCAT('ALTER TABLE `', t.`TABLE_SCHEMA`, '`.`', t.`TABLE_NAME`, '` CONVERT TO CHARACTER SET $charset COLLATE $collation;') as FRST FROM `information_schema`.`TABLES` t WHERE t.`TABLE_SCHEMA` = '$dbname' ORDER BY 1";
$result = mysql_query($sql) or die(mysql_error());
while ($row = mysql_fetch_assoc($result)) {
echo $row["FRST"]."\r\n";
if(!$printonly) {
mysql_query($row["FRST"]) or die(mysql_error());
}
}
}
?>
You can change all databases' charset or only selected. You can select databases to skip. You can also just print queries and execute them via PHPMyAdmin for example.
The manual page states that: "This procedure requires that the table not have been modified already with statements such as INSERT or UPDATE after an upgrade to MySQL 4.1 or later. In that case, MySQL would store new values in the column using latin1, and the column will contain a mix of sjis and latin1 values and cannot be converted properly."
This applies to tables which have rows with different character sets. While the above statement is probably true for the sjis and latin1 character sets, it turns out that if you have a mix of latin1 and utf-8 in a single table there's probably a "clean" way to fix this. We had to do this recently for a new customer of ours, and since the process was somewhat of a pain to come up with, I wrote about it here (in the hope that I can save someone else out there some time): http://www.blueboxgrp.com/news/2009/07/mysql_encoding
Very easy in SHELL...
In my exemple, I ALTER to UTF8 CHARSET. TO RUN SCRIPT:
./altera_colation.sh NAME_DB
----- HERE ----------
#!/bin/bash
USER='admin';
PASS='admin';
BANCO=$1;
QUERY="SELECT table_name FROM information_schema.TABLES WHERE table_schema = '$BANCO';";
TABELAS=$(mysql -u $USER --password=$PASS $BANCO --execute="$QUERY" | sed 's/|//g' | tail -n +2)
var=0;
for tables in $TABELAS; do
var=`expr $var + 1`
echo "ALTER TABLE $tables ......"
mysql -u $USER --password=$PASS $BANCO -e "ALTER TABLE $tables CONVERT TO CHARSET utf8 COLLATE utf8_unicode_ci"
done
------------ END ------------
Add your own comment.