MySQL :: MySQL 5.4 Reference Manual :: 7.2.22 Speed of INSERT Statements

MySQL 5.4 Reference Manual :: 7 Optimization :: 7.2 Optimizing SELECT and Other Statements :: 7.2.22 Speed of INSERT Statements

« 7.2.21 INFORMATION_SCHEMA Optimization

7.2.23 Speed of UPDATE Statements »

Section Navigation [Toggle]

7.2.22. Speed of `INSERT` Statements

The time required for inserting a row is determined by the following factors, where the numbers indicate approximate proportions:

Connecting: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting row: (1 × size of row)
Inserting indexes: (1 × number of indexes)
Closing: (1)

This does not take into consideration the initial overhead to open tables, which is done once for each concurrently running query.

The size of the table slows down the insertion of indexes by log N, assuming B-tree indexes.

You can use the following methods to speed up inserts:

If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster. See Section 5.1.4, “Server System Variables”.
If multiple clients are inserting a lot of rows, you can get higher speed by using the INSERT DELAYED statement. See Section 12.2.5.2, “INSERT DELAYED Syntax”.
For a MyISAM table, you can use concurrent inserts to add rows at the same time that SELECT statements are running, if there are no deleted rows in middle of the data file. See Section 7.3.3, “Concurrent Inserts”.
When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements. See Section 12.2.6, “LOAD DATA INFILE Syntax”.
With some extra work, it is possible to make LOAD DATA INFILE run even faster for a MyISAM table when the table has many indexes. Use the following procedure:
1. Optionally create the table with CREATE TABLE.
2. Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.
3. Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name. This removes all use of indexes for the table.
4. Insert data into the table with LOAD DATA INFILE. This does not update any indexes and therefore is very fast.
5. If you intend only to read from the table in the future, use myisampack to compress it. See Section 13.5.3.3, “Compressed Table Characteristics”.
6. Re-create the indexes with myisamchk -rq /path/to/db/tbl_name. This creates the index tree in memory before writing it to disk, which is much faster that updating the index during LOAD DATA INFILE because it avoids lots of disk seeks. The resulting index tree is also perfectly balanced.
7. Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.
LOAD DATA INFILE performs the preceding optimization automatically if the MyISAM table into which you insert data is empty. The main difference between automatic optimization and using the procedure explicitly is that you can let myisamchk allocate much more temporary memory for the index creation than you might want the server to allocate for index re-creation when it executes the LOAD DATA INFILE statement.

You can also disable or enable the nonunique indexes for a MyISAM table by using the following statements rather than myisamchk. If you use these statements, you can skip the FLUSH TABLE operations:
```
ALTER TABLE tbl_name DISABLE KEYS;
ALTER TABLE tbl_name ENABLE KEYS;
```
To speed up INSERT operations that are performed with multiple statements for nontransactional tables, lock your tables:
```
LOCK TABLES a WRITE;
INSERT INTO a VALUES (1,23),(2,34),(4,33);
INSERT INTO a VALUES (8,26),(6,29);
...
UNLOCK TABLES;
```
This benefits performance because the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally, there would be as many index buffer flushes as there are INSERT statements. Explicit locking statements are not needed if you can insert all rows with a single INSERT.

To obtain faster insertions for transactional tables, you should use START TRANSACTION and COMMIT instead of LOCK TABLES.

Locking also lowers the total time for multiple-connection tests, although the maximum wait time for individual connections might go up because they wait for locks. Suppose that five clients attempt to perform inserts simultaneously as follows:
- Connection 1 does 1000 inserts
- Connections 2, 3, and 4 do 1 insert
- Connection 5 does 1000 inserts
If you do not use locking, connections 2, 3, and 4 finish before 1 and 5. If you use locking, connections 2, 3, and 4 probably do not finish before 1 or 5, but the total time should be about 40% faster.

INSERT, UPDATE, and DELETE operations are very fast in MySQL, but you can obtain better overall performance by adding locks around everything that does more than about five successive inserts or updates. If you do very many successive inserts, you could do a LOCK TABLES followed by an UNLOCK TABLES once in a while (each 1,000 rows or so) to allow other threads access to the table. This would still result in a nice performance gain.

INSERT is still much slower for loading data than LOAD DATA INFILE, even when using the strategies just outlined.
To increase performance for MyISAM tables, for both LOAD DATA INFILE and INSERT, enlarge the key cache by increasing the key_buffer_size system variable. See Section 7.5.3, “Tuning Server Parameters”.

MySQL Enterprise. For more advice on optimizing the performance of your server, subscribe to the MySQL Enterprise Monitor. Numerous advisors are dedicated to monitoring performance. For more information, see http://www.mysql.com/products/enterprise/advisors.html.

Previous / Next / Up / Table of Contents

User Comments

Posted by Nicolas Moldavsky on May 31 2003 4:42am

[Delete] [Edit]

"If you are inserting a lot of rows from different clients, you can get higher speed by using the INSERT DELAYED statement."

I am running an application which inserts 20 to 120 rows per second on a single table (Format: Fixed/Row size: 46/No free data) using 5 to 10 client threads. I was using INSERT DELAYED but the delayed insert thread was consuming a lot of CPU and every 5 seconds all the insert threads would go into "waiting for handler lock" status for about 600 ms. The application also runs also selects and updates in the following proportion:
INSERT: 28%
SELECT: 28%
UPDATE: 44%

During that 600ms every 5 seconds, mysql didn't output anything to any thread (I checked it by monitoring network traffic).

Today I changed the INSERT DELAYED to a simple INSERT and CPU usage went down by 70% and no more intermitent locks are experienced.

Platform: Dual Intel Pentium III - Linux 2.4.20-13.7smp - MySQL 4.0.12-standard. Queries per second avg: 482.485

Posted by [name withheld] on August 5 2004 6:31am

[Delete] [Edit]

The suggestion of wrapping in transactions if you make more than about 5 modifications is in my experience a bit high.
I've seen massive performance increases in just wrapping 3 inserts. I imagine the performance gain is whenever you make more than a single insertion.

Posted by [name withheld] on November 1 2005 7:44pm

[Delete] [Edit]

Summary

--tmpdir (and pointing it at a large partition with plenty of free space)
may be useful if you have a small /tmp partition and are using myisamchk on a large table.

Details

Got a table with 36 million rows, 3 columns.

mysql> select count(*) from pagelinks;

+----------+
| count(*) |
+----------+
| 36785885 |
+----------+

mysql> desc pagelinks;

+--------------+-----------------+------+-----+---------+-------+
| Field        | Type            | Null | Key | Default | Extra |
+--------------+-----------------+------+-----+---------+-------+
| pl_from      | int(8) unsigned |      | PRI | 0       |       |
| pl_namespace | int(11)         |      | PRI | 0       |       |
| pl_title     | varchar(255)    |      | PRI |         |       |
+--------------+-----------------+------+-----+---------+-------+

which I populated using

LOAD DATA INFILE '/home/murray/tagging/wikipedia/tabbed.en_pagelinks.dat' INTO TABLE wikipedia.pagelinks;

# ls -ltrh tabbed.en_pagelinks.dat
-rw-r--r-- 1 murray murray 875M Nov 1 14:03 tabbed.en_pagelinks.dat

when running

myisamchk -r -q /var/lib/mysql/wikipedia/pagelinks

eventually causes multiple

myisamchk: Disk is full writing '/tmp/ST71pY9X' (Errcode: 28). Waiting for someone to free space... Retry in 60 secs

df -m /tmp shows

/dev/hda7 1786 1786 0 100% /tmp

but there is nothing actually in /tmp

# cd /
# du -sm tmp
1 tmp/

Fix was to use --tmpdir and point it at a partition with more than 2Gb of space.

myisamchk -r -q --tmpdir=/home/data/tmp /var/lib/mysql/wikipedia/pagelinks

# mysql -V
mysql Ver 14.7 Distrib 4.1.11, for pc-linux-gnu (i386)
debian stable/sarge

Posted by Mike Looijmans on November 24 2005 9:45am

[Delete] [Edit]

using LOCK TABLE .. WRITE may speed up the insertion, but when processing as well as loading rows, you will lock out any reader. A select on primary key (which should take close to 0 seconds) takes up to 0.2 seconds when my loader is busy loading batches of 1000 rows.

A very good workaround is to put the new rows into a temp table, then flush that table every now and then in a single tansaction. This makes the insert very quick, because all the checking and conversion was already done in the temp table.

CREATE TEMPORARY TABLE ..._temp (....) ENGINE=MEMORY;

-- Loop from here --
DELETE FROM .._temp;
INSERT INTO .._temp ... VALUES (...);
INSERT INTO .._temp ... VALUES (...);

LOCK TABLE ... WRITE, ..._temp READ;
INSERT INTO ... SELECT * FROM ..._temp;
UNLOCK TABLES;
-- Repeat until done --

DROP TABLE ..._temp;

What I've seen in my system, is that the INSERTs go at about the same speed as when locking the table, but the clients do not notice any performance loss at all when the loader is running.

Posted by Jonathan Lampe on March 30 2006 10:02pm

[Delete] [Edit]

LOCK TABLES do not appear to speed up INSERT...SELECT queries. (At least under MySQL 4.0)

For example...

LOCK TABLES Folders READ, FolderPerms WRITE;
INSERT INTO FolderPerms
SELECT 'asc' as SID,
FolderID1.ID,Folders.InstID,Folders.Name,
Folders.FolderPath FROM FolderID1
LEFT JOIN Folders ON FolderID1.ID=Folders.ID GROUP BY FolderID1.ID;
UNLOCK TABLES;

...runs just as fast with or without the LOCK/UNLOCK statements.

Posted by Kyle Drake on May 18 2006 6:10am

[Delete] [Edit]

If you have a situation where there are a lot of INSERTs taking place for an extended period of time (such as with a large import), edit your /etc/fstab to turn off access times with the noatime option for the partition that the data is located in. With access times enabled for the mounted partition (the default behavior), the OS has to write to the disk partition every time you access the database (even if it's just a read). On large imports, this can clog up your I/O buffers. This was a problem I had on a FreeBSD 5.4 machine. When I turned off access time, INSERT performance increased severalfold.

An example /etc/fstab entry without, and then with noatime, consult your manpages for OS-specific settings:

/dev/da0s1a /partition ufs rw 2 2
/dev/da0s1a /partition ufs rw,noatime 2 2

Posted by Vitaly Ignatovich on January 23 2007 4:27am

[Delete] [Edit]

If you have a big ammount of data to insert through a select ... insert query, it might be helpfull to divide it into smaller portions. I had a table with approx 30 000 000 records with 6 fields of the type double. If I try to use a select ... insert to copy all records at once into another empty table, it wouldn't go at all. But when copying 2000000 records at a time it took about 10 minutes for the whole table.
It seems that mysql executes a select completely and then starts inserting.

Posted by Christopher Brown on January 4 2008 5:04am

[Delete] [Edit]

Be aware that if you are use replication and set sync_binlog = 1, this can have a large impact on the speed of INSERT/UPDATE/DELETE operations especially if you are using mysqlimport or LOAD DATA LOCAL INFILE.

Using sync_binlog=1 on Solaris 10, I noted a speed decrease to 25% of the speed as compared to the default sync_binlog = 0.

Add your own comment.

Top / Previous / Next / Up / Table of Contents

7.2.22. Speed of INSERT Statements

User Comments

7.2.22. Speed of `INSERT` Statements