How do you tune database performance?
Every good Database Administrator (DBA) knows that tuning a database is a constant job. This is because there is always something in the database that can be changed to make it run more efficiently. But, what are some of the more common things that you should keep an eye out for if you are tuning an RDBMS? Well, let’s go through those things, but without getting too specific for any particular RDBMS (like MySQL, Oracle, etc.).
Tune database performance by designing tables efficiently
How you design your tables is very important in terms of database performance.
In general, non-RDBMS specific terms, here are some guidelines you should follow when designing tables:
- Be careful using triggers. Using triggers on specific database tables can be a nice way to come up with a solution for a specific problem. For example, if you want to enforce a difficult constraint, triggers can be a good solution. But, on the other hand, when a trigger is fired (or “triggered”) by a given SQL statement, then all the work that trigger does becomes a part of the SQL statement. This extra work can slow down the SQL statement, and sometimes that can cause a big hit on performance if the trigger fires a lot.
- Use the same data type for both primary and foreign key columns. This means that when defining a primary and foreign key, it’s not a good idea to define the primary key with a VARCHAR data type and a foreign key with a CHAR data type – it’s better to define both keys as either VARCHAR or CHAR. This is because table joins when the data types are different, then the DBMS will have to convert one of the data types to the same type as the other one. This process of conversion from one data type to another is extra work that the RDBMS, and of course slows down any joins that are done between tables with columns that have different data types.
- When choosing a data type for a numeric column, you should always pick the smallest data type that can hold your data. Choosing the smallest data type that still fits your data can save a lot of space. For example, if you choose to use the BIGINT data type when all of your data for a given column could actually fit in the TINYINT or SMALLINT data types would be a huge waste of space. Think of it this way – if your table has millions of rows then it would result in millions of entries for that BIGINT column which would have too much extra space inside them. So, the unused and wasted space could add up very fast.
- Another thing to consider is to use CHAR instead of VARCHAR when you know that the entries will have the same length. VARCHAR columns actually have an extra 1 to 3 bytes per entry so that they can also hold the length of the actual data. In addition to that extra 1 to 3 bytes, whenever a VARCHAR column has some change in the column data, there is some more processing overhead because the length of the column data has to be calculated yet again.
Tune database performance by reducing disk writes and reads
Reading and writing from the disk (as in the actual hard drive on which the database is stored) is extremely expensive in terms of database performance. Operations that read and write to and from the disk are also known as I/O operations. Anyways, you have to do the best you can with the available memory (the RAM – not to be confused with disk space), so that you reduce the number of disk reads and writes to the absolute minimum necessary. Here are some approaches to reducing the number of I/O operations, and also to minimize the time spent waiting for those operations to finish:
- One possibility to help tune database performance by reducing the number of I/O operations is to use more than just one physical disk drive. By having multiple disk drives, you can spread out the database files on the different disk drives. By spreading out the database files over multiple drives, you can have many I/O operations happening in parallel, which can be a big performance advantage.
-
Another important approach to correctly tuning your database for better performance is to make sure that you allocate buffers which have the right size. What exactly is a buffer? Well, it’s a part of the memory (the RAM) that is used to temporarily store data that has recently been retrieved from the database and also temporarily hold data that will eventually be stored in the permanent storage (like a disk). If you have a buffer of the right size, then data that has already been read from the database will remain in the buffer for a fair amount of time, and that means that if a new query is run asking for the same data, then it can be retrieved directly from the buffer rather than making another expensive (performance-wise) query into the database.
Buffers can also be used to store data that is written to the buffer by the RDBMS for temporary storage. Then, this data can later be copied to permanent storage at some point in the future. This is also known as asynchronous I/O because the data is not written immediately from the RDBMS to the permanent storage – all major RDBMS’s have some form of it, since it’s quite important for performance.
Creating indexes help greatly in tuning database performance
Database indexes are a major performance enhancer in databases. You can read more about some best practices to follow for database indexes here:
SQL Index best practices.