Code and slides: Batched Key Access feature preview

At the moment, there are two big features in the works in MySQL optimizer - Subquery optimizations and Batched Key Access. While the former is a part of MySQL 6.0, I wrote about it here in my blog, and so forth, the latter was in nearly stealth mode until a couple of weeks ago.

That’s no longer the case:

  • Batched Key Access source code is now published as mysql-6.0-bka-preview tree.
  • Igor Babaev, the author of the feature, gave a talk about Batched Key Access at MySQL User Conference, and the slides are available here.
  • There is now Batched Key Access page on forge.mysql.com which a semi-official feature homepage which describes how you can get source, binaries, etc

If you’re developing any kind of “remote” storage engine or are using MySQL in a data warehousing application where disk IO is the bottleneck, I recommend to take a look.

Posted in joins on April 20th, 2008 by spetrunia | | 0 Comments

Nested-loops join speedup idea promoted to WL task

The idea how to speed up nested-loops join a little I’ve mentioned earlier has now been promoted into a worklog entry. It is now known as WL#3724 “Short-Cutting Join Execution: Speeding up star queries” and its text is publicly available at MySQLForge.

At the moment there is only a short description, but hopefully Martin Hansson (the assigned developer) will add more content there.

Posted in ideas, joins on April 18th, 2007 by spetrunia | | 0 Comments

How to find out if an outer join was converted to inner

After this post I’ve got a question how one can tell if his outer join was converted to inner. You can find it out by looking at the warning generated by EXPLAIN EXTENDED. If the outer join wasn’t converted, you’ll see it in the rewritten query in the warning:

mysql> explain extended select * from t1 left join (t2, t3) on t2.a= t1.a;
...
3 rows in set, 1 warning (0.00 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`t1`.`a` AS `a`,`test`.`t2`.`a` AS `a`,`test`.`t3`.`a`
AS `a` from `test`.`t1` left join (`test`.`t2` join `test`.`t3`) on ((`test`.`t2`.
`a` = `test`.`t1`.`a`)) where 1

In this query LEFT JOIN is not converted to inner.

Now let’s try a query where outer join will be converted:

mysql> explain extended select * from t1 left join (t2, t3) on t2.a= t1.a where t2.a < 10;
...
3 rows in set, 1 warning (0.00 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`t1`.`a` AS `a`,`test`.`t2`.`a` AS `a`,`test`.`t3`.`a`
AS `a` from `test`.`t1` join `test`.`t2` join `test`.`t3` where ((`test`.`t1`.
`a` = `test`.`t2`.`a`) and (`test`.`t2`.`a` < 10))

Looking at the warning text we see an inner join instead of the left join, and also we see that the ON clause has been added into the WHERE.


Yes, those warning messages are hard to read, they have excessive quoting and the lines are too long. But at the moment certain kinds of query plan details are displayed only there, so skiming through the rewritten query text may pay off when you have doubts about what is going on.

Posted in joins, how-it-works on February 24th, 2007 by spetrunia | | 0 Comments

Nested outer joins

Here is MySQL’s nested outer joins optimization cheat sheet:

  • Conversions:
    • RIGHT JOIN is converted to LEFT JOIN. FULL JOIN is not supported.
    • Outer joins are converted to inner joins when possible
  • Constraints on join order:

    • “Outer tables go first”
    • “No interleaving”
  • Table access rules:

    • “Inner” table access methods are constructed from parts of the ON condition. WHERE condition can’t be used to construct table accesses.
    • Parts of ON condition are checked as soon as possible
    • Parts of the WHERE condition
      - are not checked until we’ve found a row combination that matches the ON clause
      - are checked as soon as possible after that.

Or, in more detail:

Conversions

RIGHT JOIN to LEFT JOIN conversion is obvious:

  (t1 RIGHT JOIN t2 ON cond) = (t2 LEFT JOIN t1 ON cond)

Conversion from outer to inner join is possible when the result of inner join will be the same. It will be the same if the row combination with NULL-complimented row will not pass the WHERE clause. For example, if we look at the query

  t1 LEFT JOIN t2 ON some_cond WHERE t2.a=t1.b

we’ll see that a row with t2.a IS NULL will not satisfy the WHERE condition. Hence, this outer join can be converted to inner.

Constraints on join order

Outer tables go first
any outer table used in the outer join’s ON clause must be before all of the inner tables.
No interleaving
tables contained within an outer join must form a continuous sequence in the join order. Interleaving with tables that are outside of the outer join is not allowed.

Table access rules

Now, this requires some explanation. MySQL’s nested-loops join code tries to check parts of the WHERE as soon as
possible. For example when a query

SELECT * FROM
  t1,t2, ...
WHERE
  t1.col1=c1 AND
  t2.col1=t1.col2 AND t2.col2=c3 AND
  ...

is executed using a join order of (t1, t2,…), it proceeds according to this kind of scenario:

Inner join swimlanes

We can see here that the the WHERE condition is split into parts that are checked “as early as possible”.

With outer joins is more complicated. We need to know if we’ll need to generate a NULL-complemented row combination. We won’t need to if there was a combination of inner tables that matched the ON (but not necessarily the WHERE) clause. The solution is to switch the WHERE parts checking on and off.

The best way to show it is with example: Suppose we have a query

 SELECT * FROM ... ot1 LEFT JOIN (it1, it2) ON somecond WHERE ...

and suppose the join order is (ot1, it1, it2, …). The execution will proceed in this manner:

outer join swimlanes

What’s visible there? When we start scanning table it1, we check only the ON condition. We can’t check the WHERE - we could iscard some it1’s row that is the only row that will match the ON condition, think there will be no matches, and erroneously generate the NULL-complimented row.

After we’ve found the match for the ON condition, we go back and check all parts of the WHERE we did not check because of the above mentioned reason.

After that, the execution proceeds as if this was an inner join, with ON merged into the WHERE clause.

The diagram also shows why we can’t use parts of the WHERE clause to create table acccess methods: because there are times when we can’t use parts of the WHERE for filtering. We always can use parts of the ON though.

Now it should be clear where all Table Access Rules came from.

Posted in joins, how-it-works on February 22nd, 2007 by spetrunia | | 2 Comments