Von ORDIX AG auf Freitag, 03. Juni 2022
Kategorie: Oracle

MySQL HeatWave - Speed up your OLAP tasks

A few months ago, Oracle published the new MySQL HeatWave feature for the Oracle Cloud Infrastructure (OCI), which promised to speed up analysis and transactional queries on your MySQL database without big effort.

By using so called "HeatWave-Engines" (or "RAPID-Engines") in addition to the original MySQL instance, customers will be able to use OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) workloads directly in one MySQL database.

To benefit from this feature no single SQL needs to be adjusted to optimize the performance.

The database system will be expanded by a HeatWave plugin which is responsible to receive the original SQL queries and proceed the statements together with a HeatWave cluster.

Beside the better performance, the customer will not feel any disadvantage.

To make the HeatWave plugin work, the tables used by the SQL query need to be enhanced by secondary engine: The HeatWave plugin. Afterwards the data needs to be loaded into this second engine:

alter table lineitem secondary_engine=rapid;

alter table lineitem secondary_load;

As soon as the secondary engine for a table is enabled, the data will be locked to avoid any DML statements.

The above listed example which loads table data into the secondary engine is simplified. If the data model is more complex, there are multiple steps which should or must be checked in advanced e.g., you need to check if enough memory is available in the HeatWave cluster to load the table data.

There are also some restrictions regarding the table definitions – at the end, your data and data model needs to be compatible with the HeatWave plugin.

If you do not want to think about loading all used tables into the secondary engine on your own, you can use provided system calls by oracle (https://dev.mysql.com/doc/heatwave/en/auto-parallel-load.html):

sys.heatwave_load (db_list,[options]);

I will go through such restrictions in a future article. For now, I will simply show you the benefit of HeatWave by using a compatible data model.

TPC-H Benchmark 

To create a comparable basis, I used the predefined queries from the TPC-H benchmarks (http://www.tpc.org/tpch/). With these data model I created multiple databases (3, 9, 18 and 36 GB). The TPC-H queries will be executed and summarized in the following examples with and without using the HeatWave engine​.

To prepare the first example I will use the 3GB database called tpc_3 and the third query of the TPC-H benchmark. Therefore, I will load all database tables into the HeatWave engine:

​Now the whole database is loaded into the HeatWave engine.

​To compare the performance with and without HeatWave engine I enabled profiling and run the third query:

 → | 4 | 7.0137665

The query was finished without HeatWave engine in 7 seconds. Now let us run the same query after enabling the secondary engine:

 → | 15 | 0.7033645

By enabling the secondary engine, the query was finished within 0,7 seconds. By looking at the execution plan you can see a hint that RAPID as secondary engine was used:

Using where; Using temporary; Using filesort; Using secondary engine RAPID

​To increase the complexity of my tests a little bit, I will now compare all TPC-H queries and run all of them on the different database sizes. This led to the following result:

​The difference becomes particularly clear by looking on query 9: The database needs 114 seconds to run the query without a secondary engine and only 0,26 seconds when using RAPID.

Keep in mind that my biggest database with 38 GB is not big. All data can simply be loaded into the database buffer which already speeds up the performance without using a secondary engine.

The difference will become bigger and bigger by increasing the size of the used data.

​DB Größe (in GB) ​Query ​InnoDB (in s) ​RAPID (in s)​InnoDB/RAPID
​3 ​3 ​7,12 ​0,059​120-fache
​9​17,407​0,073​238-fache
​9​3​50,98​0,107​476-fache
​9​364,87​0,825​442-fache
​18​3​65,03​0,197​330-fache
​9​684,85​4,548​150-fache
​36​3​250,12​0,384​651-fache
​9​-​-​-

​Just to point it out: by linearly increasing the amount of data (9 – 18 – 36 GB) the runtime of the RAPID engine is also increasing linearly (0,107 – 0,197 – 0,384 seconds) while the consumption of time used by the InnoDB engine is starting to increase exponential (50,98 – 65,03 – 250,12 seconds).

By using the HeatWave feature you will be able to increase the performance of typical OLTP/OLTP queries based on big data in comparison to the classic InnoDB engine. For sure it is a matter of database configuration (e.g., size of the database buffer) by which database size the advantage will be bigger than the effort to enable a secondary engine.

The benefit of using the HeatWave engine is obvious: if you know that OLTP/OLAP queries will run against your database on a regular basis, you can simply enable a secondary engine during that time window and speed up the performance. Afterwards, you can go ahead and disable the secondary engine again, so that all locks on the tables will be removed as a consequence.

For databases which are already running in the OCI it will not increase the costs significant, because you need the additional HeatWave resources temporary.

Unfortunately, the fact that the HeatWave engine is currently only available in the OCI, the costs/efforts for customers to put their database into the OCI initially is for sure bigger. There are several options to use the HeatWave feature for En-Premise databases (e.g., migration, replication ..).

If you are already running your DWH MySQL database in the OCI, you should give HeatWave a try.

Kommentare hinterlassen