MySQL :: MySQL 5.1 with Maria Reference Manual :: 12.2.9.7 Correlated Subqueries

MySQL 5.1 with Maria Reference Manual :: 12 SQL Statement Syntax :: 12.2 Data Manipulation Statements :: 12.2.9 Subquery Syntax :: 12.2.9.7 Correlated Subqueries

« 12.2.9.6 EXISTS and NOT EXISTS

12.2.9.8 Subqueries in the FROM clause »

Section Navigation [Toggle]

12.2.9.7. Correlated Subqueries

A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. For example:

SELECT * FROM t1 WHERE column1 = ANY
       (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);

Notice that the subquery contains a reference to a column of t1, even though the subquery's FROM clause does not mention a table t1. So, MySQL looks outside the subquery, and finds t1 in the outer query.

Suppose that table t1 contains a row where column1 = 5 and column2 = 6; meanwhile, table t2 contains a row where column1 = 5 and column2 = 7. The simple expression ... WHERE column1 = ANY (SELECT column1 FROM t2) would be TRUE, but in this example, the WHERE clause within the subquery is FALSE (because (5,6) is not equal to (5,7)), so the subquery as a whole is FALSE.

Scoping rule: MySQL evaluates from inside to outside. For example:

SELECT column1 FROM t1 AS x
  WHERE x.column1 = (SELECT column1 FROM t2 AS x
    WHERE x.column1 = (SELECT column1 FROM t3
      WHERE x.column2 = t3.column1));

In this statement, x.column2 must be a column in table t2 because SELECT column1 FROM t2 AS x ... renames t2. It is not a column in table t1 because SELECT column1 FROM t1 ... is an outer query that is farther out.

For subqueries in HAVING or ORDER BY clauses, MySQL also looks for column names in the outer select list.

For certain cases, a correlated subquery is optimized. For example:

val IN (SELECT key_val FROM tbl_name WHERE correlated_condition)

Otherwise, they are inefficient and likely to be slow. Rewriting the query as a join might improve performance.

Aggregate functions in correlated subqueries may contain outer references, provided the function contains nothing but outer references, and provided the function is not contained in another function or expression.

Previous / Next / Up / Table of Contents

User Comments

Posted by Are you mortal? Then prepare to die. on November 24 2004 2:35pm

[Delete] [Edit]

Another example of when a subquery is optimized is when using a subquery based on a multipart primary key to 'join' the two tables.

For example

[sql]
DELETE FROM t1 WHERE ROW(c1,c2) IN (
SELECT c1, c2 FROM t2
);
[/sql]

You could easily select the above 'join' using a 'where t1.c1=t2.c1 and t1.c2=t2.c2', but a delete statement won't know what to do (and fails) over the joined tables.

Posted by Ryan Findley on January 11 2006 8:00pm

[Delete] [Edit]

Note that according to bug #9090 ( http://bugs.mysql.com/bug.php?id=9090 ) it appears that ALL subqueries will be executed as correlated subqueries until version 5.2 or 6.0, when the optimizer may gain the ability to correctly execute uncorrelated subqueries.

Posted by Stephen Dewey on August 18 2006 2:28am

[Delete] [Edit]

Ryan Findley is good to point out the bug report, but as far as I can tell it is not true that MySQL rewrites ALL subqueries as correlated subqueries. Rather, the problem that the bug report identifies is that the MySQL optimizer does that to subqueries that test an IN statement. This is listed as one of the current limitations of MySQL's subquery support here (third item down, "Subquery optimization for IN..."):

http://dev.mysql.com/doc/refman/5.0/en/subquery-restrictions.html

So if you are writing a subquery that does not use IN, I think that you will be able to keep it uncorrelated. However, I have not tested this.

Posted by Tobias F on April 12 2007 12:03pm

[Delete] [Edit]

If you have a slow 'correlated' subquery with IN, you can optimize it with a join to get around the bug described by Ryan and Stephen. After the optimization the execution time is no longer O(M×N).

the queries are made up for this example.
original query (execution time 4 sec):
SELECT * FROM employee WHERE room = 15 OR
room IN (
SELECT room FROM assistant WHERE building = 11
)

optimized query (execution time 0,004 sec):
SELECT e1.* FROM employee e1
INNER JOIN (
SELECT room FROM assistant WHERE building = 11
) ea ON e1.room = ea.room
UNION SELECT * FROM employee WHERE room = 15

if you have a simple query like
SELECT * FROM employee WHERE room IN (
SELECT room FROM assistant WHERE building = 11
)
just leave the UNION operator

Add your own comment.

Top / Previous / Next / Up / Table of Contents