Using RDS Blue/Green Deployments to Change MySQL Table Row Format

23rd November 2023

Before MySQL v5.7, the default row format was COMPACT. In v5.7 this default was changed to DYNAMIC. The DYNAMIC row format offers a number of advantages over COMPACT around storage of long variable length columns and large index key prefixes that you can read about here.

If, like us, your MySQL database has been around since before v5.7, any tables created in earlier versions will have a row format of COMPACT. If we want to take advantage of the improvements brought by the DYNAMIC format, we need to go through the process of migrating tables from one row format to another.

Unfortunately, changing the ROW_FORMAT of a table does not utilise the Instant algorithm introduced in MySQL v8.0, meaning it requires a table rebuild. For large tables this can result in locking tables for an unacceptable amount of time, impacting customers. Thankfully, with our database in AWS RDS, we can utilise Blue/Green deployments to allow us to update all the required tables, with minimal impact to end-users, restricted to the period of time taken for a Blue/Green deployment switchover.

By default, the green (replica) environment of a Blue/Green deployment is read-only. The most common reason to perform Blue/Green deployments is to upgrade software versions, so making the instance writeable is unnecessary. Keeping it read-only reduces the likelihood of creating inconsistencies between environments. To get around this, we need to create our Blue/Green deployment using a custom parameter group that changes the value of the read_only parameter to 0. This allows us to perform the ALTER TABLE statements on the Green (replica) environment without locking tables in the Blue (production) environment.

Tip: Delete Read Replicas

Our production MySQL instance has a couple of read-replicas associated. When you create a Blue/Green deployment it also creates read-replicas to match your current environment.

Unfortunately, when we perform the switch-over, these replicas are switched in series. So every replica increases the amount of time for the switch and the potential for impacting customers. If possible, re-point anything that uses these replicas to the main instance and delete any replicas before creating the Blue/Green deployment.

The suggested method of changing the row format of a table seems to be to perform the ALTER TABLE t1 ROW_FORMAT = DYNAMIC; statement, but there are a couple of down-sides to this;

  • Performing the ALTER TABLE statement explicitly sets a value of row_format=DYNAMIC under Create_options for the table, even though it matches the current default. Setting an explicit row format on the table is something we may need to be aware of if we ever choose to change row formats in the future.
  • The ALTER TABLE statement can only target a single table. If we have hundreds of tables to migrate, we need to perform hundreds of ALTER TABLE statements.

An alternative option is to run the OPTIMIZE TABLE command. This also changes the row format of the table to the current default, but does so without setting Create_options;

When a ROW_FORMAT option is not specified explicitly, or when ROW_FORMAT=DEFAULT is used, an operation that rebuilds a table silently changes the row format of the table to the format defined by the innodb_default_row_format variable. (source)

The documentation for the OPTIMIZE TABLE command explains why this occurs;

For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE ... FORCE, which rebuilds the table to update index statistics and free unused space in the clustered index. (source)

Another advantage to using OPTIMIZE TABLE is that it allows us to chain table names, meaning we can target all the tables that need to be updated using a single statement; OPTIMIZE TABLE t1, t2, t3, …, …, etc;

Once we’ve migrated all the legacy tables from the COMPACT to DYNAMIC row format, we can simply perform a switch-over of the Blue/Green deployment and we're finished!

Caveat: Custom Parameter Group

The value of read_only in the default parameter group is {trueIfReplica}. If you remember, this is something we needed to change to be able to run the ALTER TABLE queries on the Green environment.

Once we're finished, our production database is still using this custom parameter group. Switching the parameter group back to the default requires a reboot. This may be something you can schedule for a maintenance period, but it's less than ideal. Thankfully we can update the value of read_only on our custom parameter group back to the default. So we're still using the custom group, but the values are all the same as the default.