Now and then, smaller, or larger problems arise during consulting missions involving MySQL, which we would like to report here. In this case, our customer wanted to replace a MySQL single instance with a MySQL InnoDB cluster with three nodes. We have reported about the cluster here several times and do not discuss it in detail here.
In this specific case, there was a supposedly simple issue. An essential requirement for the implementation of the cluster was simply not fulfilled: There were tables without a primary key!
Good preconditions …?
The MySQL InnoDB Cluster is based on the so-called “Group Replication”. To use this feature, all tables must use the InnoDB storage engine and have a primary key. Further requirements for using “Group Replication” or the InnoDB Cluster can be found here.
In addition to the actual database server, which was already in operation, the customer had provided three further nodes, which should be combined to form a cluster.
In our test environment, we recreated the problem in a highly simplified way with four Docker containers. The initial node (“legacy system” with the name Berlin) contains a database with the name “application”, which has a table “names”. The other three containers (Hamburg, Cologne, Frankfurt) represent the new cluster.
Here is a brief overview of the used data model:
As you can easily see, the table “names” has only two columns of the type “varchar” but no primary key.
Poor conditions …?
Even before the actual initialization of the cluster, the errors could be “tracked down”. At this point, an imported version of the “application” database was already in one of the nodes of the cluster to be built (Hamburg). By means of the following test routine, it should be verified in advance (before building the cluster) whether the system(s) would be suitable for building a cluster at all. Already in the first step (calling the check routine on the first node) it turned out that we have a problem:
A primary key is missing on the table “names”! Unfortunately, this error occurred not only once, but several times with our customer.
How to help …?
Of course, it is not difficult to add a (!) primary key. However, we did not want to do this manually. In addition, one must consider that adding a new column to various tables can cause problems within the application. It could be that a developer uses a “select * from tabX” at one point or another and expects a certain number of result columns. If an additional column, which is used as a primary key, suddenly appears, this could lead to an error. We decided to use the new function “Generated Invisible Primary Keys”, which is available from version 8.0.30 onwards. But before we continue with the customer example, a short excursion on what this parameter changes on a database system.
Ultimately, the setting ensures that tables that are created without an explicit primary key are forced to have one. This primary key is always called: “my_row_id”.
Here is a small case study:
As can be seen, the new “feature” is not used by default. Let's create a small table “tab1” for the test without activating this configuration.
Tables that are created without a primary key, as described above, will not have one!
After activating this function, this behavior changes:
The table has now automatically been given the primary key “my_row_id”.
Well hidden …?
But what about the visibility of the primary key? The additional column with the primary key can be recognized via the instruction “show create table”. But what about the above-mentioned “select * from” variant that a developer may have used?
Ok, the column is not displayed. But is this always the case? Maybe the application tries to validate the data model via other mechanisms:
- “desc table”
- System tables
Let's check this, too:
This way we get a “proof of existence” for the column and the primary key. Of course, there is also a small “tweak” here to “adapt” this behavior of the database:
As we see, we see nothing!
A sound plan …?
The plan (we will not explain all steps here) for the customer to build the cluster is as follows:
1. set up the new cluster with the nodes Hamburg, Cologne, Frankfurt.
2. activate the function “Generated Invisible Primary Keys” in the cluster.
3. setting up a replication from the old system “Berlin” into the cluster (to the primary node “Hamburg”).
4. backup and restore of the database.
5. Backup and restore the database “application” into the cluster.
6. starting the replication
7. switch from the old system to the cluster.
According to the plan, the problem of the missing primary key should take care of itself with the third step. All tables that are created with the restore of the dump without primary keys now should get one by default.
After setting up the cluster (step 1), which we will skip at this point, the corresponding parameter is activated:
Replication between the primary cluster (“Hamburg” node) and the legacy system (“Berlin”) has also been set up (step 3), without describing this in detail here. However, replication is not yet active.
With the import of the backup from node “Berlin” (legacy system) into the cluster (node Hamburg), the tables are newly created. Due to the above configuration, the missing primary keys are created.
Let's take a look at the result (just to be sure). How was the table “names” created in the database “application” in the cluster?
The missing primary key exists.
The plan works …
The plan could therefore be implemented successfully. The MySQL dump from the old system was successfully imported into the cluster. All missing primary keys were compensated via the generated keys. If this had not worked, a problem would have arisen at this point, because, as we learned at the beginning, tables without primary keys cannot be imported into a cluster. Certainly, the customer's issue could have been solved differently. The new “feature” eliminated a lot of manual work at this point and brought us efficiently to our goal.
Do you have questions about MySQL? Then talk to us!