It is a common situation in database applications that the primary key is a unique identifier and new rows are inserted in the ascending order of the primary key. Thus, insertions into the clustered index do not require random reads from a disk.
On the other hand, secondary indexes are usually nonunique, and
insertions into secondary indexes happen in a relatively random
order. This would cause a lot of random disk I/O operations
without a special mechanism used in InnoDB
.
If an index record should be inserted into a nonunique secondary
index, InnoDB
checks whether the secondary
index page is in the buffer pool. If that is the case,
InnoDB
does the insertion directly to the
index page. If the index page is not found in the buffer pool,
InnoDB
inserts the record to a special insert
buffer structure. The insert buffer is kept so small that it
fits entirely in the buffer pool, and insertions can be done
very fast.
Periodically, the insert buffer is merged into the secondary index trees in the database. Often it is possible to merge several insertions into the same page of the index tree, saving disk I/O operations. It has been measured that the insert buffer can speed up insertions into a table up to 15 times.
The insert buffer merging may continue to happen
after the inserting transaction has been
committed. In fact, it may continue to happen after a server
shutdown and restart (see Section 13.6.6.2, “Forcing InnoDB
Recovery”).
Insert buffer merging may take many hours when many secondary
indexes must be updated and many rows have been inserted. During
this time, disk I/O will be increased, which can cause
significant slowdown on disk-bound queries. Another significant
background I/O operation is the purge thread (see
Section 13.6.9, “InnoDB
Multi-Versioning”).
User Comments
If you type show innodb status\G, it will show - under section "INSERT BUFFER AND ADAPTIVE HASH INDEX" line like that:
54737 inserts, 12769 merged recs, 3612 merges
From what I have understood, this basically shows that 54 000 records have been inserted, but only 12 000 have been merged into indexes. The trick I have just found is that even after you stop last query to the database, the database still keeps on doing heavy IO, and what is happening then is that the remaining records are merged. The procedure continues until "inserts"=="merged recs". Only after that the IO really stops.
In my case the merging procedure takes about 3hrs after the last query. Even if you stop the database by shutdown command during merging, when you turn it on again, it will continue to merge the rows anyway. Only then it will make it slower, or so it seems from my experience. Dunno why. Also, if you shut down the database and then start it and it will continue to merge records, the "inserts" counter will be zeroed, so you will have not the slightest idea, how long to wait untill it finishes ;-)
Tip: DO NOT shut down the database until you see that everything is merged. Also, keep an eye on the difference between those values. If it grows constantly during normal operation, you are really missing some resources on your computer (probably IO). Also, merging is usually the reason for big IO even when the traffic drops down and you'd expect the database to perform faster, but it does not ;-) In the worst possible scenario (happened to me), the server was working on maximum IO rate, but was completely unusable, all IO went down to merging. This "deadlock" could only be resolved manually by stopping all queries for 6hrs...
Add your own comment.