MySQL Character Set
MySQL supports multiple character sets to store and manipulate text in different languages. Understanding character sets is crucial for handling text encoding, collation, and data integrity in MySQL.
1. What is a Character Set in MySQL?
A character set in MySQL defines which characters can be stored and how each one is encoded as bytes. It applies to text-based data types such as CHAR, VARCHAR, and TEXT.
Common MySQL Character Sets
Character Set | Description |
---|---|
utf8mb4 | Supports full Unicode (Recommended for modern applications) |
utf8 | Alias for utf8mb3: at most 3 bytes per character, so no emojis or other supplementary characters (Deprecated, use utf8mb4 instead) |
latin1 | Western European character set |
ascii | Supports only basic ASCII characters (0-127) |
ucs2 | Unicode with a fixed 2 bytes per character (Basic Multilingual Plane only) |
utf16 | Unicode with 2 or 4 bytes per character |
utf32 | Unicode with a fixed 4 bytes per character |
⚠ Important Note:
utf8mb4 should be used instead of utf8 to store emojis and full Unicode characters.
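The difference is easiest to see by counting bytes. The sketch below assumes a MySQL 5.7 or 8.0 session. It uses hex literals with a character set introducer so the result does not depend on the connection encoding; the byte values are the UTF-8 encodings of é and 😊.

```sql
-- Maxlen is the maximum number of bytes per character for each character set.
SHOW CHARACTER SET LIKE 'utf8%';

-- 0xC3A9 is é in UTF-8: 1 character, 2 bytes.
SELECT CHAR_LENGTH(_utf8mb4 0xC3A9) AS chars,
       LENGTH(_utf8mb4 0xC3A9)      AS bytes;

-- 0xF09F988A is 😊 in UTF-8: 1 character, 4 bytes.
-- Four bytes is more than the 3-byte limit of utf8 (utf8mb3).
SELECT CHAR_LENGTH(_utf8mb4 0xF09F988A) AS chars,
       LENGTH(_utf8mb4 0xF09F988A)      AS bytes;
```

CHAR_LENGTH counts characters while LENGTH counts bytes, so comparing the two shows how much storage a character needs.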
2. Checking Available Character Sets
To list all character sets available in MySQL, use:
SHOW CHARACTER SET;
To check the default character set for your MySQL server:
SHOW VARIABLES LIKE 'character_set_server';
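The server default is only one of several related settings. As a rough sketch, the statements below list every character set and collation variable visible to the current session, then the collations available for utf8mb4:

```sql
-- All character set variables (server, database, client, connection, results, ...).
SHOW VARIABLES LIKE 'character_set_%';

-- The matching collation variables.
SHOW VARIABLES LIKE 'collation_%';

-- Collations that can be paired with utf8mb4.
SHOW COLLATION WHERE Charset = 'utf8mb4';
```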
3. Setting Character Sets
A. Setting Character Set for a Database
You can specify a character set when creating a database:
CREATE DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
To change the default for an existing database (tables that already exist keep their current character set):
ALTER DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
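To confirm what a database currently uses as its default, one option is to query information_schema. This sketch reuses the mydatabase name from the examples above:

```sql
-- Default character set and collation applied to new tables in the database.
SELECT SCHEMA_NAME,
       DEFAULT_CHARACTER_SET_NAME,
       DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'mydatabase';
```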
B. Setting Character Set for a Table
When creating a table, specify a default character set for its text columns:
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100)
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
To change the character set of an existing table:
ALTER TABLE users CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
C. Setting Character Set for a Column
Modify the character set for specific columns:
ALTER TABLE users MODIFY name VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
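Before and after changing tables or columns, it can help to see what is actually in place. The following is a minimal sketch that assumes the mydatabase and users names used above:

```sql
-- Character set and collation of each column in one table.
SHOW FULL COLUMNS FROM users;

-- Tables in the database whose default collation is not a utf8mb4 collation.
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydatabase'
  AND TABLE_COLLATION NOT LIKE 'utf8mb4%';

-- Text columns in the database that still use another character set.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydatabase'
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME <> 'utf8mb4';
```

Non-text columns have a NULL CHARACTER_SET_NAME, which is why the last query filters them out.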
D. Setting Character Set for a Connection
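Each client connection can have its own character set. SET NAMES sets character_set_client, character_set_connection, and character_set_results together:
SET NAMES utf8mb4;
SET CHARACTER SET is similar, but it sets character_set_connection to the character set of the default database rather than to the one you name:
SET CHARACTER SET utf8mb4;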
To check the character set of the current connection:
SHOW VARIABLES LIKE 'character_set_connection';
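A connection involves three character set variables, so checking only one can hide a mismatch. A short sketch that reads all three at once for the current session:

```sql
-- client:     encoding the server assumes for statements sent by the client
-- connection: encoding statements are converted to before execution
-- results:    encoding used for result sets sent back to the client
SELECT @@character_set_client     AS client,
       @@character_set_connection AS connection,
       @@character_set_results    AS results;
```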
4. Choosing the Right Character Set
Use Case | Recommended Character Set |
---|---|
Multi-language support | utf8mb4 |
English-only data | latin1 or ascii |
Legacy applications | latin1 (if required) |
Storing emojis & special characters | utf8mb4 |
5. Handling Character Set Issues
Problem | Solution |
---|---|
Characters appear as ??? or as garbled text such as ðŸ˜Š instead of 😊 | Ensure utf8mb4 is used for the database, table, column, and connection |
Wrong character encoding in MySQL Workbench | Use SET NAMES utf8mb4; before querying |
Data corruption after migration | Convert tables using ALTER TABLE ... CONVERT TO CHARACTER SET |
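One way to check whether 4-byte characters are stored intact is a small round trip. This is only a sketch: the charset_check table and txt column are hypothetical names, and the hex literal is the UTF-8 encoding of 😊, which keeps the test independent of the client encoding.

```sql
-- Temporary table that is visible only to this session.
CREATE TEMPORARY TABLE charset_check (
    txt VARCHAR(10) CHARACTER SET utf8mb4
);

-- Insert one emoji as raw UTF-8 bytes.
INSERT INTO charset_check (txt) VALUES (_utf8mb4 0xF09F988A);

-- If the column stores the value intact, HEX returns F09F988A and CHAR_LENGTH returns 1.
SELECT HEX(txt) AS stored_bytes, CHAR_LENGTH(txt) AS chars
FROM charset_check;

DROP TEMPORARY TABLE charset_check;
```

If the stored bytes are correct but the character still displays incorrectly in a client, the connection character set is the more likely cause than the table definition.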
Summary
- Always use utf8mb4 for Unicode and emoji support.
- Specify character sets at the database, table, and column levels.
- Use SHOW CHARACTER SET and SHOW VARIABLES to inspect character settings.
- Ensure client and server character sets match to avoid encoding issues.