homepage >> Safely Converting a MySQL Table to ut    
 

Yossef Benharosh is an apt web developer and the author of the eBook The essentials of object oriented PHP.

Yossef Benharosh web developer profile linkedin twitter github

Safely Converting a MySQL Table to utf8mb4 for Emoji and Multilingual Support

Introduction

If you need to store emojis and multilingual text (such as Hebrew) in a MySQL database, your table must use utf8mb4 instead of utf8. This guide will walk you through safely converting a MySQL table to utf8mb4_unicode_ci while preserving data integrity.

 

Why utf8mb4?

The utf8 character set in MySQL only supports up to 3 bytes per character, which excludes certain Unicode characters, such as emojis and some non-Latin scripts. utf8mb4 supports up to 4 bytes per character, making it the correct choice for full Unicode compatibility.

 

Steps for a Safe Conversion

To prevent data loss, we will follow these steps:

  1. Create a backup of the original table
  2. Create a new table with the correct charset and collation
  3. Copy data from the original table into the new table
  4. Verify data integrity
  5. Swap the new table with the original
  6. (Optional) Remove the old table after validation

 

PHP Script for Automated Conversion

The following PHP script automates the conversion process.

 

Prerequisites

  • MySQL database with an existing table you want to convert
  • PHP installed with PDO enabled
  • Backup access in case of an unexpected failure

 

The PHP Script

<?php
function convertTableToUtf8mb4($pdo, $tableName) {
    try {
        // Define table names
        $backupTable = "{$tableName}_backup";
        $newTable = "{$tableName}_new";
        $oldTable = "{$tableName}_old";

        // 1. Create a backup of the original table
        $pdo->exec("DROP TABLE IF EXISTS $backupTable");
        $pdo->exec("CREATE TABLE $backupTable LIKE $tableName");
        $pdo->exec("INSERT INTO $backupTable SELECT * FROM $tableName");
        echo "Backup created for table: $tableName.\n";

        // 2. Create a new table with utf8mb4 charset and collation
        $pdo->exec("DROP TABLE IF EXISTS $newTable");
        $pdo->exec("CREATE TABLE $newTable LIKE $tableName");
        $pdo->exec("ALTER TABLE $newTable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci");
        echo "New table created with utf8mb4 collation: $newTable.\n";

        // 3. Copy data from the original table into the new table
        $pdo->exec("INSERT INTO $newTable SELECT * FROM $tableName");
        echo "Data copied to new table: $newTable.\n";

        // 4. Verify data integrity
        $originalCount = $pdo->query("SELECT COUNT(*) FROM $tableName")->fetchColumn();
        $newCount = $pdo->query("SELECT COUNT(*) FROM $newTable")->fetchColumn();

        if ($originalCount != $newCount) {
            throw new Exception("Row count mismatch: $tableName has $originalCount rows, but $newTable has $newCount rows. Aborting conversion.");
        }
        echo "Data integrity verified: both tables have $originalCount rows.\n";

        // 5. Swap table names
        $pdo->exec("RENAME TABLE $tableName TO $oldTable, $newTable TO $tableName");
        echo "Tables swapped: $newTable is now $tableName.\n";

        // 6. Optional: Drop the old table after verification
        // Uncomment the next line when you're confident that the new table is correct.
        // $pdo->exec("DROP TABLE $oldTable");

        echo "Conversion complete for table: $tableName.\n";
    } catch (Exception $e) {
        echo "Error: " . $e->getMessage() . "\n";
        // Optionally, implement additional logic to restore from backup if needed.
    }
}

// Database connection
$dsn = 'mysql:host=localhost;dbname=your_db;charset=utf8mb4';
$pdo = new PDO($dsn, 'your_username', 'your_password', [
    PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8mb4",
]);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Specify the table name dynamically
$tableName = 'rt_articles'; // Change this as needed
convertTableToUtf8mb4($pdo, $tableName);

 

Running the Script

  1. Save the script as convert_utf8mb4.php
  2. Replace 'your_db', 'your_username', and 'your_password' with your actual database details.
  3. Run the script from the command line or execute it in a browser.
  4. Check that your data is correctly stored and displayed.
  5. Once validated, you can remove the old table (rt_articles_old).

 

Conclusion

By following this method, you can safely migrate your MySQL tables to utf8mb4 without risking data loss. This ensures your database can properly store emojis and a wide range of multilingual text.

 

Disclaimer

As far as I know, the script on this page is functional and harmless. However, I make no guarantees. I also take no responsibility for any damage that may occur when using the script (the chances of any issues arising are slim, but you never know). Please use this script responsibly and with good judgment.

 

Recommended for you:

Linux terminal cheat sheet

Upload files to Laravel app

Using jQuery and AJAX to dynamically update a page