10 Minuten Lesezeit (1912 Worte)

Make yourself invisible: MySQL Generated Invisible Primary Keys

Now and then, smaller, or larger problems arise during consulting missions involving MySQL, which we would like to report here. In this case, our customer wanted to replace a MySQL single instance with a MySQL InnoDB cluster with three nodes. We have reported about the cluster here several times and do not discuss it in detail here.

In this specific case, there was a supposedly simple issue. An essential requirement for the implementation of the cluster was simply not fulfilled: There were tables without a primary key!

Good preconditions …?

The MySQL InnoDB Cluster is based on the so-called “Group Replication”. To use this feature, all tables must use the InnoDB storage engine and have a primary key. Further requirements for using “Group Replication” or the InnoDB Cluster can be found here.

In addition to the actual database server, which was already in operation, the customer had provided three further nodes, which should be combined to form a cluster.

In our test environment, we recreated the problem in a highly simplified way with four Docker containers. The initial node (“legacy system” with the name Berlin) contains a database with the name “application”, which has a table “names”. The other three containers (Hamburg, Cologne, Frankfurt) represent the new cluster.

Here is a brief overview of the used data model:

MySQL  berlin:33060+ ssl  JS > show tables;
+-----------------------+
| Tables_in_application |
+-----------------------+
| names                 |
+-----------------------+
1 row in set (0.0019 sec)

MySQL  berlin:33060+ ssl  JS > desc names;
+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| firstname | varchar(20) | YES  |     | NULL    |       |
| lastname  | varchar(20) | YES  |     | NULL    |       |
+-----------+-------------+------+-----+---------+-------+
2 rows in set (0.0087 sec)
 

As you can easily see, the table “names” has only two columns of the type “varchar” but no primary key.

Poor conditions …?

Even before the actual initialization of the cluster, the errors could be “tracked down”. At this point, an imported version of the “application” database was already in one of the nodes of the cluster to be built (Hamburg). By means of the following test routine, it should be verified in advance (before building the cluster) whether the system(s) would be suitable for building a cluster at all. Already in the first step (calling the check routine on the first node) it turned out that we have a problem:

MySQL  hamburg:33060+ ssl  JS > dba.checkInstanceConfiguration()
Validating local MySQL instance listening at port 3306 for use in an InnoDB cluster...

This instance reports its own address as hamburg:3306
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Checking whether existing tables comply with Group Replication requirements...
ERROR: The following tables do not have a Primary Key or equivalent column:
application.names

Group Replication requires tables to use InnoDB and have a PRIMARY KEY or PRIMARY KEY Equivalent (non-null unique key). Tables that do not follow these requirements will be readable but not updateable when used with Group Replication. If your applications make updates (INSERT, UPDATE or DELETE) to these tables, ensure they use the InnoDB storage engine and have a PRIMARY KEY or PRIMARY KEY Equivalent.
If you can't change the tables structure to include an extra visible key to be used as PRIMARY KEY, you can make use of the INVISIBLE COLUMN feature available since 8.0.23: https://dev.mysql.com/doc/refman/8.0/en/invisible-columns.html

Checking instance configuration...

{ "status": "error" }
 

A primary key is missing on the table “names”! Unfortunately, this error occurred not only once, but several times with our customer.

How to help …?

Of course, it is not difficult to add a (!) primary key. However, we did not want to do this manually. In addition, one must consider that adding a new column to various tables can cause problems within the application. It could be that a developer uses a “select * from tabX” at one point or another and expects a certain number of result columns. If an additional column, which is used as a primary key, suddenly appears, this could lead to an error. We decided to use the new function “Generated Invisible Primary Keys”, which is available from version 8.0.30 onwards. But before we continue with the customer example, a short excursion on what this parameter changes on a database system.

Ultimately, the setting ensures that tables that are created without an explicit primary key are forced to have one. This primary key is always called: “my_row_id”.

Here is a small case study: 

mysql> SELECT @@sql_generate_invisible_primary_key;
+--------------------------------------+
| @@sql_generate_invisible_primary_key |
+--------------------------------------+
|                                    0 |
+--------------------------------------+
1 row in set (0.01 sec)
 

As can be seen, the new “feature” is not used by default. Let's create a small table “tab1” for the test without activating this configuration.

mysql> create table tab1 ( a varchar(10));
Query OK, 0 rows affected (0.03 sec)

mysql> show create table tab1;
+-------+----------------------------------------------------------------+
| Table | Create Table                                                                                                            
+-------+----------------------------------------------------------------+
| tab1  | CREATE TABLE `tab1` (
  `a` varchar(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+----------------------------------------------------------------+
1 row in set (0.00 sec)
 

Tables that are created without a primary key, as described above, will not have one!

After activating this function, this behavior changes: 

mysql> set sql_generate_invisible_primary_key=1;
Query OK, 0 rows affected (0.00 sec)

mysql> create table tab2 ( a varchar(10));
Query OK, 0 rows affected (0.04 sec)

mysql> show create table tab2;
+-------+----------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                      
+-------+----------------------------------------------------------------+
| tab2  | CREATE TABLE `tab2` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `a` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+----------------------------------------------------------------+
1 row in set (0.00 sec)
 

The table has now automatically been given the primary key “my_row_id”.

Well hidden …?

But what about the visibility of the primary key? The additional column with the primary key can be recognized via the instruction “show create table”. But what about the above-mentioned “select * from” variant that a developer may have used?

mysql> select * from names;
+-----------+----------+
| firstname | lastname |
+-----------+----------+
| Matthias  | Jung     |
+-----------+----------+
1 row in set (0.00 sec)
 

Ok, the column is not displayed. But is this always the case? Maybe the application tries to validate the data model via other mechanisms:

  • “desc table”
  • System tables

Let's check this, too:

mysql> desc names;
+-----------+-----------------+------+-----+---------+--------------------------+
| Field     | Type            | Null | Key | Default | Extra                    |
+-----------+-----------------+------+-----+---------+--------------------------+
| my_row_id | bigint unsigned | NO   | PRI | NULL    | auto_increment INVISIBLE |
| firstname | varchar(20)     | YES  |     | NULL    |                          |
| lastname  | varchar(20)     | YES  |     | NULL    |                          |
+-----------+-----------------+------+-----+---------+--------------------------+
3 rows in set (0.01 sec)

mysql> select COLUMN_NAME, DATA_TYPE, COLUMN_TYPE,EXTRA,COLUMN_COMMENT from information_schema.columns where table_schema='application' and table_name = 'names';
+-------------+-----------+-----------------+--------------------------+----------------+
| COLUMN_NAME | DATA_TYPE | COLUMN_TYPE     | EXTRA                    | COLUMN_COMMENT |
+-------------+-----------+-----------------+--------------------------+----------------+
| firstname   | varchar   | varchar(20)     |                          |                |
| lastname    | varchar   | varchar(20)     |                          |                |
| my_row_id   | bigint    | bigint unsigned | auto_increment INVISIBLE |                |
+-------------+-----------+-----------------+--------------------------+----------------+
3 rows in set (0.00 sec)
 

This way we get a “proof of existence” for the column and the primary key. Of course, there is also a small “tweak” here to “adapt” this behavior of the database:

mysql> set global show_gipk_in_create_table_and_information_schema=off;
Query OK, 0 rows affected (0.00 sec)

Welchen Einfluss hat dies auf die Sichtbarkeit bei:
„desc table“
Systemtabellen
„show create table“

mysql> desc names;
+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| firstname | varchar(20) | YES  |     | NULL    |       |
| lastname  | varchar(20) | YES  |     | NULL    |       |
+-----------+-------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

mysql> show create table names;
+--------------------------------------------------------------+
| Table | Create Table                                         |
+--------------------------------------------------------------+
| names | CREATE TABLE `names` (
  `firstname` varchar(20) DEFAULT NULL,
  `lastname` varchar(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 
+--------------------------------------------------------------+

mysql> select COLUMN_NAME, DATA_TYPE, COLUMN_TYPE,EXTRA,COLUMN_COMMENT from information_schema.columns where table_schema='application' and table_name = 'names';
+-------------+-----------+-------------+-------+----------------+
| COLUMN_NAME | DATA_TYPE | COLUMN_TYPE | EXTRA | COLUMN_COMMENT |
+-------------+-----------+-------------+-------+----------------+
| firstname   | varchar   | varchar(20) |       |                |
| lastname    | varchar   | varchar(20) |       |                |
+-------------+-----------+-------------+-------+----------------+
2 rows in set (0.01 sec)
 

As we see, we see nothing!

A sound plan …?

The plan (we will not explain all steps here) for the customer to build the cluster is as follows:

1. set up the new cluster with the nodes Hamburg, Cologne, Frankfurt.
2. activate the function “Generated Invisible Primary Keys” in the cluster.
3. setting up a replication from the old system “Berlin” into the cluster (to the primary node “Hamburg”). 
4. backup and restore of the database.
5. Backup and restore the database “application” into the cluster.
6. starting the replication
7. switch from the old system to the cluster. 

MySQL  hamburg:33060+ ssl  JS > \sql set persist sql_generate_invisible_primary_key= 1;
Fetching global names for auto-completion... Press ^C to stop.
Query OK, 0 rows affected (0.0017 sec)
 

According to the plan, the problem of the missing primary key should take care of itself with the third step. All tables that are created with the restore of the dump without primary keys now should get one by default.

After setting up the cluster (step 1), which we will skip at this point, the corresponding parameter is activated: 

MySQL  hamburg:33060+ ssl  JS > \sql set persist sql_generate_invisible_primary_key= 1;
Fetching global names for auto-completion... Press ^C to stop.
Query OK, 0 rows affected (0.0017 sec)
 

Replication between the primary cluster (“Hamburg” node) and the legacy system (“Berlin”) has also been set up (step 3), without describing this in detail here. However, replication is not yet active.

With the import of the backup from node “Berlin” (legacy system) into the cluster (node Hamburg), the tables are newly created. Due to the above configuration, the missing primary keys are created.

bash-4.4# mysqldump --host=berlin --user=root --password=root 
--opt --source-data=2 --all-databases | mysql --host=hamburg --
user=root --password=root 

Let's take a look at the result (just to be sure). How was the table “names” created in the database “application” in the cluster? 

bash-4.4# mysqlsh --database=application --sql
Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
MySQL Shell 8.0.32

Copyright (c) 2016, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

 MySQL  hamburg  application  SQL > show create table names;
+-------+-------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                       
+-------+-------------------------------------------------------------------------------------------+

| names | CREATE TABLE `names` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `firstname` varchar(20) DEFAULT NULL,
  `lastname` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+-------------------------------------------------------------------------------------------+
1 row in set (0.0022 sec)
 

The missing primary key exists.

The plan works …

The plan could therefore be implemented successfully. The MySQL dump from the old system was successfully imported into the cluster. All missing primary keys were compensated via the generated keys. If this had not worked, a problem would have arisen at this point, because, as we learned at the beginning, tables without primary keys cannot be imported into a cluster. Certainly, the customer's issue could have been solved differently. The new “feature” eliminated a lot of manual work at this point and brought us efficiently to our goal.

Do you have questions about MySQL? Then talk to us! 

Principal Consultant bei ORDIX

 

Kommentare

Derzeit gibt es keine Kommentare. Schreibe den ersten Kommentar!
Sonntag, 08. September 2024

Sicherheitscode (Captcha)

×
Informiert bleiben!

Bei Updates im Blog, informieren wir per E-Mail.

Weitere Artikel in der Kategorie