Details
-
Bug
-
Resolution: Fixed
-
None
-
Critical
-
-
Change all utf8 fields to the mysql datatype utf8mb4.
-
Empty show more show less
Description
Reason: https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434
change all to utf8mb4:
- in confluence documentation
- in preconditioncheck
- modify all columns / tables via alter script
Manuel guide for utf8 to utf8mb4 migration.
1. Change the database character set and collation
ALTER DATABASE `<database name>` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
2. Disables the foreign keys check
SET FOREIGN_KEY_CHECKS=0;
3. Change the field types.
ALTER TABLE `LEK_term_proposal` CHANGE COLUMN `term` `term` TEXT CHARACTER SET 'utf8mb4' NOT NULL DEFAULT '' COMMENT 'the proposed term' ; ALTER TABLE `LEK_terms` CHANGE COLUMN `term` `term` TEXT CHARACTER SET 'utf8mb4' NOT NULL DEFAULT '' ; ALTER TABLE `LEK_plugin_segmentstatistic_terms` CHANGE COLUMN `term` `term` TEXT CHARACTER SET 'utf8mb4' NOT NULL DEFAULT '' ; ALTER TABLE `LEK_term_history` CHANGE COLUMN `term` `term` TEXT CHARACTER SET 'utf8mb4' NOT NULL DEFAULT '' ;
4. This script will auto-generate the alter scripts for all tables for character set and collation migration from
utf8 to utf8mb4 and utf8_general_ci to utf8mb4_unicode_ci. Execute all result scripts after step 5.
SELECT CONCAT( 'ALTER TABLE ', table_name, ' CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; ', 'ALTER TABLE ', table_name, ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; ') as r FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C WHERE C.collation_name = T.table_collation AND T.table_schema = <database name> AND (C.CHARACTER_SET_NAME != 'utf8mb4' OR C.COLLATION_NAME not like 'utf8mb4%
5. This script will auto-generate the alter scripts for migration from utf8_bin to utf8mb4_bin for all table columns where the table collation is utf8_bin.
Execute this script after the scripts from the step 4 are executed.
SELECT CONCAT('ALTER TABLE ', t1.table_name, ' MODIFY ', t1.column_name, ' ', if(lower(t1.data_type) = 'varchar',concat(t1.data_type,'(' , CHARACTER_MAXIMUM_LENGTH, ')'),t1.data_type), ' CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;' )as r FROM information_schema.columns t1 WHERE t1.TABLE_SCHEMA like <database name> AND t1.COLLATION_NAME IS NOT NULL AND t1.COLLATION_NAME IN ('utf8_bin');
6. Client character set
SET NAMES utf8mb4;
7. Generate repair and optimize queries for all database tables.
SELECT CONCAT( 'REPAIR TABLE ', table_name, '; OPTIMIZE TABLE ',table_name,';') as r FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C WHERE C.collation_name = T.table_collation AS AND T.table_schema = <database name> ;
Make sure to set the client and server character set as well.
Add In my.cnf:
[client] default-character-set = utf8mb4 [mysql] default-character-set = utf8mb4 [mysqld] character-set-client-handshake = FALSE character-set-server = utf8mb4 collation-server = utf8mb4_unicode_ci
Additional info:
character-set-client-handshake=FALSE
With this the server will ignore character set set from the client. With this we will assure that utf8mb4 will be used always