Looking at MySQL 5.6’s optimizer: EXPLAIN UPDATE

MySQL 5.6 adds support for EXPLAIN UPDATE. This is a useful feature, so we want to have it in MariaDB 10.0, too. Besides that, MariaDB 10.0 has SHOW EXPLAIN feature, and we want it work for UPDATE commands, too.

Now, a bit of code history. Why didn’t MySQL have EXPLAIN UPDATE from the start, like other database systems? To the uninformed, lack of EXPLAIN UPDATE looks like simple lazyness. After all, everyone who has read a database textbook can imagine that the code should have this form:

run_update_query(UpdateQuery q) {
   QueryPlan qp= optimize_query(q);
   run_query_plan(qp);
}

and adding EXPLAIN UPDATE is a matter of adding another function:

run_explain_update(UpdateQuery q) {
   QueryPlan qp= optimize_query(q);
   print_update_query_plan(qp);
}

print_update_query_plan(QueryPlan qp)
{
  // print the plan for UPDATE.
}

Seems like a task for an intern. The problem is that MySQL’s code is not structured this way. There is no point in time where all decisions about
how to run the UPDATE command have been made and stored in a certain data structure, but the query execution didn’t start yet. The code basically is structured like this:

mysql_update()
{
  typename part_of_query_plan;

  …do something…
  part_of_query_plan=…;
    if (all done)
      return;
  …do something …

  if (…)
  {
    typename another_part_of_query_plan;
    …do something else…
    another_part_of_query_plan= …
  }

  typename yet_another_part_of_query_pan= …;
  …
}

It is not trivial to pull out all of query plan choices out of this. Oracle’s optimizer team had two possible options:

  1. Re-write UPDATE handling code to use the textbook approach.
  2. Keep the current code structure, and inject “if (running_explain) {...}” at many locations.

#1 would be a minor revolution. It would introduce new code that is run for every UPDATE query. New code may have bugs. It may cause query plans to change, and not always for the better.
#2 is conservative. It would keep the old structure in place, and would require less work. The result won’t be very impressive, though - there will be a single piece of code that handles both UPDATE and EXPLAIN UPDATE, with lots of “if (running_explain) {...}” all over it.

I guess, the choice depends on your deadlines, what other changes are there, etc. Oracle’s team choose to do #2. However, when I tried playing with it, I’ve found

  • a query plan that has changed since 5.5 (BUG#67638)
  • a wrong query plan - EXPLAIN doesn’t match the execution (BUG#67637)

I’m not sure if BUG#67638 is a problem. Maybe, it is expected because of the changes in the cost model. However, if the change was expected anyway, why did they choose to use the conservative solution for EXPLAIN UPDATE? And if they did choose a conservative solution for EXPLAIN UPDATE, why do we still get bugs like BUG#67637?

The questions are not just of curiosity. We at MariaDB need to apply the patch, and make it work with SHOW EXPLAIN. Do we wait for Oracle to fix the above bugs, or fix them ourselves? Do we stick to their EXPLAIN UPDATE implementation (and keep applying their fixes), or forget it and roll our own? Decisions, decisions…

Posted in Uncategorized on November 20th, 2012 by spetrunia | |

Leave a reply

You must be logged in to post a comment.