The most efficient way to process GROUP BY
is when an index is used to directly retrieve the grouping
columns. With this access method, MySQL uses the property of
some index types that the keys are ordered (for example,
BTREE
). This property enables use of lookup
groups in an index without having to consider all keys in the
index that satisfy all WHERE
conditions.
This access method considers only a fraction of the keys in an
index, so it is called a loose index
scan. When there is no WHERE
clause, a loose index scan reads as many keys as the number of
groups, which may be a much smaller number than that of all
keys. If the WHERE
clause contains range
predicates (see the discussion of the
range
join type in
Section 7.2.1, “Optimizing Queries with EXPLAIN
”), a loose index scan looks up
the first key of each group that satisfies the range
conditions, and again reads the least possible number of keys.
This is possible under the following conditions:
The query is over a single table.
The GROUP BY
names only columns that
form a leftmost prefix of the index and no other columns.
(If, instead of GROUP BY
, the query has
a DISTINCT
clause, all distinct
attributes refer to columns that form a leftmost prefix of
the index.) For example, if a table t1
has an index on (c1,c2,c3)
, loose index
scan is applicable if the query has GROUP BY c1,
c2,
. It is not applicable if the query has
GROUP BY c2, c3
(the columns are not a
leftmost prefix) or GROUP BY c1, c2, c4
(c4
is not in the index).
The only aggregate functions used in the select list (if
any) are MIN()
and
MAX()
, and all of them
refer to the same column. The column must be in the index
and must follow the columns in the GROUP
BY
.
Any other parts of the index than those from the
GROUP BY
referenced in the query must
be constants (that is, they must be referenced in
equalities with constants), except for the argument of
MIN()
or
MAX()
functions.
For columns in the index, full column values must be
indexed, not just a prefix. For example, with c1
VARCHAR(20), INDEX (c1(10))
, the index cannot be
used for loose index scan.
If loose index scan is applicable to a query, the
EXPLAIN
output shows
Using index for group-by
in the
Extra
column.
Assume that there is an index idx(c1,c2,c3)
on table t1(c1,c2,c3,c4)
. The loose index
scan access method can be used for the following queries:
SELECT c1, c2 FROM t1 GROUP BY c1, c2; SELECT DISTINCT c1, c2 FROM t1; SELECT c1, MIN(c2) FROM t1 GROUP BY c1; SELECT c1, c2 FROM t1 WHERE c1 <const
GROUP BY c1, c2; SELECT MAX(c3), MIN(c3), c1, c2 FROM t1 WHERE c2 >const
GROUP BY c1, c2; SELECT c2 FROM t1 WHERE c1 <const
GROUP BY c1, c2; SELECT c1, c2 FROM t1 WHERE c3 =const
GROUP BY c1, c2;
The following queries cannot be executed with this quick select method, for the reasons given:
There are aggregate functions other than
MIN()
or
MAX()
:
SELECT c1, SUM(c2) FROM t1 GROUP BY c1;
The columns in the GROUP BY
clause do
not form a leftmost prefix of the index:
SELECT c1, c2 FROM t1 GROUP BY c2, c3;
The query refers to a part of a key that comes after the
GROUP BY
part, and for which there is
no equality with a constant:
SELECT c1, c3 FROM t1 GROUP BY c1, c2;
Were the query to include WHERE c3 =
, loose index
scan could be used.
const
As of MySQL 5.5, the loose index scan access method can be
applied to other forms of aggregate function references in the
select list, in addition to the
MIN()
and
MAX()
references already
supported:
AVG(DISTINCT)
,
SUM(DISTINCT)
, and
COUNT(DISTINCT)
are
supported. AVG(DISTINCT)
and SUM(DISTINCT)
take a
single argument.
COUNT(DISTINCT)
can have
more than one column argument.
There must be no GROUP BY
or
DISTINCT
clause in the query.
The loose scan limitations described earlier still apply.
Assume that there is an index idx(c1,c2,c3)
on table t1(c1,c2,c3,c4)
. The loose index
scan access method can be used for the following queries:
SELECT COUNT(DISTINCT c1), SUM(DISTINCT c1) FROM t1; SELECT COUNT(DISTINCT c1, c2), COUNT(DISTINCT c2, c1) FROM t1;
Loose index scan is not applicable for the following queries:
SELECT DISTINCT COUNT(DISTINCT c1) FROM t1; SELECT COUNT(DISTINCT c1) FROM t1 GROUP BY c1;
User Comments
Add your own comment.