MySQL Character Sets & Collations

Check

Table collation...

SHOW TABLE STATUS LIKE 'mytable'\G

*************************** 1. row ***************************           Name: mytable         Engine: InnoDB        Version: 10     Row_format: Dynamic           Rows: 81772 Avg_row_length: 211    Data_length: 17317888Max_data_length: 0   Index_length: 6864896      Data_free: 4194304 Auto_increment: 82706    Create_time: 2021-03-09 15:37:35    Update_time: 2023-01-26 10:00:44     Check_time: NULL      Collation: utf8_general_ci       Checksum: NULL Create_options:        Comment:1 row in set (0.00 sec)

Table character set...

SHOW CREATE TABLE mytable\G

CREATE TABLE `mytable` (  `id` int(11) NOT NULL AUTO_INCREMENT,  `myid` varchar(5) DEFAULT NULL,  `mycolumn` varchar(50) DEFAULT NULL,  PRIMARY KEY (`id`),  UNIQUE KEY `idx_mytable_myid` (`myid`)) ENGINE=InnoDB AUTO_INCREMENT=82706 DEFAULT CHARSET=utf8 COMMENT='     '

Instance default character sets...

status

mysql  Ver 8.0.23-commercial for Linux on x86_64 (MySQL Enterprise Server - Commercial)
Connection id:          48486048Current database:Current user:           root@localhostSSL:                    Not in useCurrent pager:          stdoutUsing outfile:          ''Using delimiter:        ;Server version:         8.0.23-commercial MySQL Enterprise Server - CommercialProtocol version:       10Connection:             Localhost via UNIX socketServer characterset:    utf8mb4Db     characterset:    utf8mb4Client characterset:    utf8mb4Conn.  characterset:    utf8mb4UNIX socket:            /mysql/mysql.sockBinary data as:         HexadecimalUptime:                 559 days 2 hours 13 min 34 sec
Threads: 5  Questions: 325482610  Slow queries: 7908120  Opens: 1698  Flush tables: 3  Open tables: 1370  Queries per second avg: 6.737
See the troubleshooting case study for an interesting issue caused by a character set mismatch. 
In this case the mismatch is down to the difference in default character set between MySQL 5.7 and 8.0 when the table is created from the same DDL which explictly sets trhe table character set to utf8

Database character set and collation...

USE mydatabase;

SELECT @@character_set_database,

       @@collation_database; 

SELECT default_character_set_name,

       default_collation_name

  FROM information_schema.schemata

 WHERE schema_name = 'mydatabase'; 

To see available character sets and their default collation...

SHOW CHARACTER SET;

To see all available collations for a character set...

SELECT collation_name,

       pad_attribute

  FROM information_scheme.collations

 WHERE character_set_name = 'utf8mb4';

SHOW COLLATION WHERE Charset = 'utf8mb4';

Unicode...

utf8mb4

utf8mb3

utf8

ucs2

utf16

utf16le

utf32

uses 1 to 4 bytes per character

uses 1 to 3 bytes per character (deprecated in MySQL 8.0)

alias for utf8mb3 (may eventually become an alias for utf8mb4)

uses 2 bytes per character (deprecated in MuSQL 8.0.28)

use 2 to 4 bytes per character

Little-endian version of utf16

uses 4 bytes per character 

Some notes about collation in MySQL...

In collation names, you will see the following suffixes...

_ai

_as

_ci

_cs

_bin

Accent-insensitive

Accent-sensitive

Case-insensitive

Case-Sensitive

Binary

National Character Set

These statements are equivalent...

SELECT N'mytext';

SELECT n'mytext';

SELECT _utf8'mytext';

Change

Database

ALTER DATABASE mydatabase CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;

Table

ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Column

ALTER TABLE mytable CHANGE mycolumn VARCHAR(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci

Bibliography