Get the Best Out of MariaDB with Performance Tuning

1
42442

 

MariaDB is one of the most popular database servers made by the original developers of MySQL, and has a strong developer and user community. Rugged as MariaDB is, getting the best out of it requires performance tuning. Read how to install MariaDB and tune it for better performance.

MySQL is one of the most popular open source databases in the world of technology. Earlier, MySQL was an independent company, which offered open source as well as commercial databases. Later, Sun Microsystems acquired MySQL, and then Oracle acquired Sun. When Oracle acquired Sun, the original MySQL developers forked a project called MariaDB, which was supported by a company called SkySQL. They took this step because they had fears about MySQL becoming closed source after its acquisition by Oracle.

After this whole saga, many Linux distributions started packaging MariaDB as the default MySQL server. Some distributions provide both MySQL (Oracle) and MariaDB, and leave the choice to the users.

Just like any other open source project, MariaDB is ahead of MySQL in many aspects; it has some enhanced features and usually goes through a faster release cycle – it brings out new features and bug fixes faster than MySQL.

MariaDB installation on Ubuntu 16.04 server
As of now, MariaDB maintains two versions, which are suitable for and usable in a production environment. These are 10.0 and 10.1. The 10.0 version is MySQL 5.5 with backported features from MySQL 5.6, whereas 10.1 is the current supported release by the MariaDB team. It has a lot of new features such as Galera cluster and performance improvements. Ubuntu 16.04 repositories have the 10.0 version, which is fine if you are not interested in the new features. To install 10.1, you have to add MariaDB’s repository to your aptitude sources.list file, which is available at the repositories page located at https://downloads.mariadb.org/mariadb/repositories/.

I am choosing the DigitalOcean mirror in my example below:

sudo apt-get install software-properties-common
sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xF1656F24C74CD1D8
sudo add-apt-repository 'deb [arch=amd64,i386,ppc64el] http://lon1.mirrors.digitalocean.com/mariadb/repo/10.1/ubuntu xenial main'

After adding the repository, fire this command to install the MariaDB server:

sudo apt update
sudo apt install mariadb-server

Aptitude will download and install the server for you, and will ask for the server’s root password during set-up. Provide the same and keep it secure.

The storage engines available in MariaDB 10.1
Storage engines form the most important part of MySQL and MariaDB. They decide how and where the data will be stored and provide features for data management. There are many storage engines supported by MariaDB, but we will discuss only the ones that are most commonly used.

Transactional engines: InnoDB, which originally came with MySQL, is the transactional engine by default for MySQL. MariaDB has a fork known as XtraDB as the default transactional engine; XtraDB is developed and maintained by Percona. XtraDB is also available as part of the Percona server. InnoDB/XtraDB is the most used ACID-compliant engine. It is safe and good for most use cases.

InnoDB is not known to be easy to backup. If you are using XtraDB, there is a tool from Percona known as xtrabackup that can be used for backing up XtraDB data. The use of this tool is related to the way data is stored by the engine. One can always use the good old mysqldump utility but it strains the CPU, as every row has to be translated into proper SQL statements so that it can be reloaded at the backup server – and the SQL statements have to be translated back to real data and also recreate any indices. On small size databases mysqldump is sufficient but for larger ones it is better to use xtrabackup. Replication to another server and taking backups there is better, too.
There is yet another ACID-compliant transactional engine present in MariaDB, called TokuDB. It is designed for storing large amounts of data in smaller spaces – up to 25x compression can be achieved. The typical use cases for TokuDB are applications where a lot of data is queried and updated at the same time and in environments where the data that is being worked upon at a given instant cannot fit in the RAM.

Non-transactional engines: MyISAM is the oldest engine in MySQL. MariaDB has an improved version – Aria. It has many performance related fixes and better safety, which is lacking in MyISAM. One disadvantage of MyISAM is that it locks the table when there is an update or insert going on – MyISAM does not support MVCC while Aria has it in its roadmap but is not there yet.
There are two other interesting storage engines – SphinxSE and MEMORY. SphinxSE can be used when full text search is needed – it relies on the external Sphinx search service. The MEMORY engine is useful for storing data purely in the RAM – there is no persistence of the data to disk and it will be lost when the server is restarted. Throwaway cache data is an ideal candidate to be stored in the MEMORY engine.
There are more storage engines supported by MariaDB such as Federated and Cassandra, which are not covered here. Information about the same is available at mariadb.com.

Performance tuning
Database performance tuning is a vast topic and depends highly on the workload of the application. Yet, there are some common principles that can be applied to every case. To begin with, check the hardware of your server. Databases require a lot of RAM and fast disk IO. The more the RAM and the faster the disk, the happier your database server is. Tools such as PHPMyAdmin (Web application) and mysqltuner (command line application) provide insights on server performance and suggested parameter tuning to achieve better performance.
It is highly recommended to have storage such as SSD in RAID 10 mode for achieving high throughput with a large database system. Just switching from a traditional rotational hard drive to SSD can improve the response times of the server significantly.
To achieve good throughput from a database, it is very important to design the database schema and use optimal queries. For example, if you have a table with a user’s details such as name, email address, user name and you are running a query which searches for the user name without having an index on the user name field, then the performance will be bad because the database server has to do a full table scan for every such query.
A full table scan is one in which all the rows have to be scanned before arriving at a result. Badly designed queries, especially joins, can cause severe performance hits. MySQL and MariaDB use the nested loop algorithm for computing the result of joins. A nested loop algorithm works something like this: let us say you have three tables (t1, t2, t3) for which a combined result is required. First, the optimiser decides the driving table, and the next table in the join is queried based on results received from the driving table. This result set forms the base, using which the third table’s data is matched. In mathematical terms, this is known as a Cartesian product.

Use caching
MySQL and MariaDB have a query cache, which can provide a significant boost in performance. It is useful in cases with less writes and mostly reads. If there are a lot of writes happening, then the server will be spending more time managing the cache instead of working on queries. For this reason it is not recommended to have a large size for the built-in MySQL query cache. Up to 512MB is sufficient. For more control the application should use its own caching such as Memcached or Redis.

Use EXPLAIN
When designing queries for the system, it is important to have optimal queries. MySQL will provide details of the execution plan of a query given to EXPLAIN. Optimizer trace is also available, which can be used to explore why one plan was chosen instead of the other. Optimizer trace is used by MySQL internally, but can be used while designing queries as well.
To use optimizer trace, type:

mysql> SET optimizer_trace=”enabled=on”

Setting correct memory parameters
MySQL uses temporary tables a lot when processing complex queries that involve joins and sorting. The default size of a temporary table is very small and is not sufficient for bigger data sets. To set a temporary table size, add the following to your my.cnf:

tmp-table-size = 1G
max-heap-table-size = 1G

It is important to note that setting larger values for temporary tables means more RAM will be consumed. If the system does not have sufficient RAM, then the values should not be increased. Mysqltuner can provide information on whether the current settings of the server will overrun the available RAM or not. A good RAM usage figure is 80-90 per cent. Beyond 90 per cent implies operating on the borderline and can cause failure, assuming the database is the only service on the server. If the same machine is used for other services such as the Web, then that needs to be taken care of as well. However, it is a bad idea to run the Web and database on the same server in large workloads.

Buffer sizes
Two important entities that can be tuned are often missed, namely, join buffer size and sort buffer size. These buffers are allocated per connection and play a significant role in the performance of the system. Join buffer is used, as the name suggests, to process joins – but only full joins on which no keys are possible.
Sort buffer size is used to sort data. If your application involves a lot of sorting, then this should be increased. The system status variable sort_merge_passes will tell you whether the value needs to be increased or not. This variable should be as low as possible.

InnoDB settings
As stated earlier, the more the RAM, the better the performance. The performance of an InnoDB system is directly dependent on the size of the data that is present and the buffer pool size. A thumb rule is that the size of the buffer pool should be at least as large as the data that is being stored – this is not feasible in all cases and not required either, since not all the data present in the database may be required at the same time. But the buffer pool size should be large enough to accommodate data that is worked upon frequently.
The InnoDB log file is used for storing a redo log, which can be replayed in case of a power failure or database crash. If there are a lot of writes in the application, then the size of the log file must be increased. The optimal size of the log file can be calculated by observing the status variable innodb_os_log_written. It is a counter (bytes) which gets incremented for every byte written to the log file. So the difference between the values of the variable in a given time period (60 seconds) gives the optimal log file size to store exactly one minute worth of recovery data. To store longer periods of redo log, the size can be a multiple of the unit time interval. The status variable innodb_log_waits should be checked as well if the log size really needs tuning. In many set-ups the log size is unnecessarily large, which increases the time required for recovery when there is a crash or power failure.
Most of the servers these days have multiple CPUs (physically, or as SMP). If the InnoDB buffer pool size is more than a gigabyte then it can be divided across the number of CPUs using the setting innodb-buffer-pool-instances. The ideal value for the number of buffer pool instances is the number of CPUs you have on the server, and it is important to note that each instance must be at least 1 GB.

File system tuning
The file system on which database data is stored is an important aspect of performance tuning. Most Linux servers use ext4 or xfs but without one important setting being enabled – noatime. Whenever a file is read, the access time of the file is updated. This is of little utility and disabling it will yield a lot of performance benefit. Additionally, it is possible to use a raw disk as storage in case of InnoDB – the whole InnoDB data file can be stored over a raw disk partition. This mode is good if you have a dedicated disk and can help to avoid double buffering by the file system as well as MySQL.
As of Ubuntu 16.04, it is possible to run ZFS on Linux. Linux file systems by default use the LRU (Least Recently Used) algorithm for file system page caching – the oldest page is evicted when memory is needed. In contrast, ZFS has ARC – Adaptive Replacement Cache, which uses a page replacement algorithm that takes into account both recently read blocks as well as frequently read blocks— basically a combination of LRU and LFU (Least Frequently Used). This does help to achieve some performance benefit, but it depends on the workload. In addition to ARC, it also supports L2ARC – an ARC that can be stored on a memory device which is slightly slower than RAM, such as NVMe or SSD. Storing InnoDB with default ZFS settings will not provide optimal performance – the record size of InnoDB storage should be 16k and that of InnoDB log storage should be 128k. In case of the innodb-file-per-table flag (which is ‘on’ by default), the InnoDB data files are created inside individual database directories instead of being stored in a single large file. The whole MySQL dataset needs to have a record size of 16k in that case.
This covers relatively a large part about MySQL and MariaDB performance tuning; the rest depends on the application workload. I have successfully configured servers for 1k to 5.5k queries per second on an average, using these strategies.

1 COMMENT

LEAVE A REPLY

Please enter your comment!
Please enter your name here