This section lists a number of miscellaneous tips for improving query processing speed:
Use persistent connections to the database to avoid
connection overhead. If you cannot use persistent
connections and you are initiating many new connections to
the database, you may want to change the value of the
thread_cache_size
variable.
See Section 7.5.3, “Tuning Server Parameters”.
Always check whether all your queries really use the indexes
that you have created in the tables. In MySQL, you can do
this with the EXPLAIN
statement. See Section 7.2.1, “Optimizing Queries with EXPLAIN
”.
Try to avoid complex SELECT
queries on MyISAM
tables that are updated
frequently, to avoid problems with table locking that occur
due to contention between readers and writers.
MyISAM
supports concurrent inserts: If a
table has no free blocks in the middle of the data file, you
can INSERT
new rows into it
at the same time that other threads are reading from the
table. If it is important to be able to do this, you should
consider using the table in ways that avoid deleting rows.
Another possibility is to run OPTIMIZE
TABLE
to defragment the table after you have
deleted a lot of rows from it. This behavior is altered by
setting the
concurrent_insert
variable.
You can force new rows to be appended (and therefore allow
concurrent inserts), even in tables that have deleted rows.
See Section 7.3.3, “Concurrent Inserts”.
MySQL Enterprise. For optimization tips geared to your specific circumstances subscribe to the MySQL Enterprise Monitor. For more information, see http://www.mysql.com/products/enterprise/advisors.html.
To fix any compression issues that may have occurred with
ARCHIVE
tables, you can use
OPTIMIZE TABLE
. See
Section 13.11, “The ARCHIVE
Storage Engine”.
Use ALTER TABLE ... ORDER BY
if you
usually retrieve rows in
expr1
,
expr2
, ...
order. By
using this option after extensive changes to the table, you
may be able to get higher performance.
expr1
,
expr2
, ...
In some cases, it may make sense to introduce a column that is “hashed” based on information from other columns. If this column is short, reasonably unique, and indexed, it may be much faster than a “wide” index on many columns. In MySQL, it is very easy to use this extra column:
SELECT * FROMtbl_name
WHEREhash_col
=MD5(CONCAT(col1
,col2
)) ANDcol1
='constant
' ANDcol2
='constant
';
For MyISAM
tables that change frequently,
you should try to avoid all variable-length columns
(VARCHAR
,
BLOB
, and
TEXT
). The table uses dynamic
row format if it includes even a single variable-length
column. See Chapter 13, Storage Engines.
It is normally not useful to split a table into different
tables just because the rows become large. In accessing a
row, the biggest performance hit is the disk seek needed to
find the first byte of the row. After finding the data, most
modern disks can read the entire row fast enough for most
applications. The only cases where splitting up a table
makes an appreciable difference is if it is a
MyISAM
table using dynamic row format
that you can change to a fixed row size, or if you very
often need to scan the table but do not need most of the
columns. See Chapter 13, Storage Engines.
If you often need to calculate results such as counts based on information from a lot of rows, it may be preferable to introduce a new table and update the counter in real time. An update of the following form is very fast:
UPDATEtbl_name
SETcount_col
=count_col
+1 WHEREkey_col
=constant
;
This is very important when you use MySQL storage engines
such as MyISAM
that has only table-level
locking (multiple readers with single writers). This also
gives better performance with most database systems, because
the row locking manager in this case has less to do.
If you need to collect statistics from large log tables, use summary tables instead of scanning the entire log table. Maintaining the summaries should be much faster than trying to calculate statistics “live.” Regenerating new summary tables from the logs when things change (depending on business decisions) is faster than changing the running application.
If possible, you should classify reports as “live” or as “statistical,” where data needed for statistical reports is created only from summary tables that are generated periodically from the live data.
Take advantage of the fact that columns have default values. Insert values explicitly only when the value to be inserted differs from the default. This reduces the parsing that MySQL must do and improves the insert speed.
In some cases, it is convenient to pack and store data into
a BLOB
column. In this case,
you must provide code in your application to pack and unpack
information, but this may save a lot of accesses at some
stage. This is practical when you have data that does not
conform well to a rows-and-columns table structure.
Normally, you should try to keep all data nonredundant (observing what is referred to in database theory as third normal form). However, there may be situations in which it can be advantageous to duplicate information or create summary tables to gain more speed.
Stored routines or UDFs (user-defined functions) may be a good way to gain performance for some tasks. See Section 18.2, “Using Stored Routines (Procedures and Functions)”, and Section 21.3, “Adding New Functions to MySQL”, for more information.
You can increase performance by caching queries or answers in your application and then executing many inserts or updates together. If your database system supports table locks, this should help to ensure that the index cache is only flushed once after all updates. You can also take advantage of MySQL's query cache to achieve similar results; see Section 7.5.5, “The MySQL Query Cache”.
Use INSERT DELAYED
when you
do not need to know when your data is written. This reduces
the overall insertion impact because many rows can be
written with a single disk write.
Use INSERT LOW_PRIORITY
when you want to
give SELECT
statements higher
priority than your inserts.
Use SELECT HIGH_PRIORITY
to get
retrievals that jump the queue. That is, the
SELECT
is executed even if
there is another client waiting to do a write.
LOW_PRIORITY
and
HIGH_PRIORITY
have an effect only for
storage engines that use only table-level locking
(MyISAM
, MEMORY
,
MERGE
).
Use multiple-row INSERT
statements to store many rows with one SQL statement. Many
SQL servers support this, including MySQL.
Use LOAD DATA
INFILE
to load large amounts of data. This is
faster than using INSERT
statements.
Use AUTO_INCREMENT
columns so that each
row in a table can be identified by a single unique value.
unique values.
Use OPTIMIZE TABLE
once in a
while to avoid fragmentation with dynamic-format
MyISAM
tables. See
Section 13.4.3, “MyISAM
Table Storage Formats”.
Use MEMORY
tables when possible to get
more speed. See Section 13.8, “The MEMORY
(HEAP
) Storage Engine”.
MEMORY
tables are useful for noncritical
data that is accessed often, such as information about the
last displayed banner for users who don't have cookies
enabled in their Web browser. User sessions are another
alternative available in many Web application environments
for handling volatile state data.
With Web servers, images and other binary assets should normally be stored as files. That is, store only a reference to the file rather than the file itself in the database. Most Web servers are better at caching files than database contents, so using files is generally faster.
Columns with identical information in different tables should be declared to have identical data types so that joins based on the corresponding columns will be faster.
Try to keep column names simple. For example, in a table
named customer
, use a column name of
name
instead of
customer_name
. To make your names
portable to other SQL servers, you should keep them shorter
than 18 characters.
If you need really high speed, you should take a look at the
low-level interfaces for data storage that the different SQL
servers support. For example, by accessing the MySQL
MyISAM
storage engine directly, you could
get a speed increase of two to five times compared to using
the SQL interface. To be able to do this, the data must be
on the same server as the application, and usually it should
only be accessed by one process (because external file
locking is really slow). One could eliminate these problems
by introducing low-level MyISAM
commands
in the MySQL server (this could be one easy way to get more
performance if needed). By carefully designing the database
interface, it should be quite easy to support this type of
optimization.
If you are using numerical data, it is faster in many cases to access information from a database (using a live connection) than to access a text file. Information in the database is likely to be stored in a more compact format than in the text file, so accessing it involves fewer disk accesses. You also save code in your application because you need not parse your text files to find line and column boundaries.
Replication can provide a performance benefit for some operations. You can distribute client retrievals among replication servers to split up the load. To avoid slowing down the master while making backups, you can make backups using a slave server. See Chapter 16, Replication.
Declaring a MyISAM
table with the
DELAY_KEY_WRITE=1
table option makes
index updates faster because they are not flushed to disk
until the table is closed. The downside is that if something
kills the server while such a table is open, you should
ensure that the table is okay by running the server with the
--myisam-recover
option, or
by running myisamchk before restarting
the server. (However, even in this case, you should not lose
anything by using DELAY_KEY_WRITE
,
because the key information can always be generated from the
data rows.)
User Comments
I've been using MySQL 5.0.85-Community for a while now on Debian Linux. The MySQL package that is installed with Debian by default includes the InnoDB engine, but in most cases, you wouldn't need InnoDB and disabling the storage engine can save you a lot of memory optimizing your overall database performance when RAM size is a constraint.
For example on the host http://www.sytru.com I've been running using Debian 5.0 Lenny and MySQL 5.0.85-Community, the InnoDB engine took around 100MB of memory even at idle times. Wasting 100 of RAM on a feature that is not used at all would cause major slowdowns if your machine has little RAM installed. So, If you don't need the InnoDB engine enabled, it's recommended to turn it off, to free up some memory and get extra optimization on machines with low memory.
To disable to InnoDB storage engine edit your my.cnf configuration file (usually in /etc/mysql/) and add the following line to it:
skip-innodb
Save and close the file, restart your database server and you're done. You can also re-compile the package from scratch, removing the InnoDB storage engine completely, but I don't recommend this just in case you need to re-enable InnoDB for some reason in the future.
Add your own comment.