How do I optimize MySQL's queries with constants?

MySQL builds different query plans for different values of bound parameters. In this article you can read a list of which steps the MySQL optimizer performs at which stage:

    Action                                             When
    Query parse                                        PREPARE
    Negation elimination                               PREPARE
    Subquery re-writes                                 PREPARE
    Nested JOIN simplification                         First EXECUTE
    OUTER->INNER JOIN conversions                      First EXECUTE
    Partition pruning                                  Every EXECUTE
    COUNT/MIN/MAX elimination                          Every EXECUTE
    Constant subexpression removal                     Every EXECUTE
    Equality propagation                               Every EXECUTE
    Constant table detection                           Every EXECUTE
    ref access analysis                                Every EXECUTE
    range/index_merge analysis and optimization        Every EXECUTE
    Join optimization                                  Every EXECUTE

There is one more thing missing from this list: MySQL can rebuild a query plan on every JOIN iteration, the so-called range checking for each record.

If you have a composite index on a table:

    CREATE INDEX ix_table2_col1_col2 ON table2 (col1, col2)

and a query like this:

    SELECT  *
    FROM    table1 t1
    JOIN    table2 t2
    ON      t2.col1 = t1.value1
            AND t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound

MySQL will NOT use an index RANGE access from (t1.value1, t1.value2_lowerbound) to (t1.value1, t1.value2_upperbound). Instead, it will use an index REF access on (t1.value1) and just filter out the wrong values.

But if you rewrite the query like this, replacing the equality with a pair of inequalities:

    SELECT  *
    FROM    table1 t1
    JOIN    table2 t2
    ON      t2.col1 <= t1.value1
            AND t2.col1 >= t1.value1
            AND t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound

then MySQL will recheck index RANGE access for each record from table1 and decide on the fly whether to use RANGE access.
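One way to check whether range checking actually kicks in is to run EXPLAIN on both forms of the join and compare the Extra column. The sketch below assumes a hypothetical schema (the column types and the id columns are not from the answer) and assumes the rewrite means replacing the equality on col1 with a pair of inequalities. With empty or very small tables the optimizer may pick a different plan, so load representative data first.

```sql
-- Hypothetical schema for illustration; column types are assumptions.
CREATE TABLE table1 (
    id                INT NOT NULL PRIMARY KEY,
    value1            INT NOT NULL,
    value2_lowerbound INT NOT NULL,
    value2_upperbound INT NOT NULL
);

CREATE TABLE table2 (
    id   INT NOT NULL PRIMARY KEY,
    col1 INT NOT NULL,
    col2 INT NOT NULL
);

CREATE INDEX ix_table2_col1_col2 ON table2 (col1, col2);

-- Equality form: look for type = ref on ix_table2_col1_col2 for t2.
EXPLAIN
SELECT *
FROM   table1 t1
JOIN   table2 t2
  ON   t2.col1 = t1.value1
 AND   t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound;

-- Inequality form: look for "Range checked for each record (index map: ...)"
-- in the Extra column for t2.
EXPLAIN
SELECT *
FROM   table1 t1
JOIN   table2 t2
  ON   t2.col1 <= t1.value1
 AND   t2.col1 >= t1.value1
 AND   t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound;
```

Whether the second plan is faster depends on the data distribution, so it is worth timing both queries rather than relying on EXPLAIN alone.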

You can read about it in these articles in my blog:

- Selecting timestamps for a time zone - how to use coarse filtering to filter out timestamps without a timezone
- Emulating SKIP SCAN - how to emulate SKIP SCAN access method in MySQL
- Analytic functions: optimizing LAG, LEAD, FIRST_VALUE, LAST_VALUE - how to emulate Oracle's analytic functions in MySQL
- Advanced row sampling - how to select N records from each group in MySQL

All these things employ RANGE CHECKING FOR EACH RECORD.

Returning to your question: there is no way to tell which plan MySQL will use for every given constant, since there is no plan before the constant is given. Unfortunately, there is no way to force MySQL to use one query plan for every value of a bound parameter. You can control the JOIN order and the indexes being chosen by using the STRAIGHT_JOIN and FORCE INDEX clauses, but they will not force a certain access path on an index or forbid the IMPOSSIBLE WHERE.
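As a rough illustration of the two hints mentioned above, the following sketch pins the join order and the index for the table1/table2 join used earlier in this answer. The table and index names come from the answer; the column types behind them are assumed.

```sql
-- STRAIGHT_JOIN makes MySQL read table1 before table2, in the order written.
-- FORCE INDEX restricts t2 to ix_table2_col1_col2 where an index is usable.
EXPLAIN
SELECT STRAIGHT_JOIN *
FROM   table1 t1
JOIN   table2 t2 FORCE INDEX (ix_table2_col1_col2)
  ON   t2.col1 = t1.value1
 AND   t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound;
```

The hints fix which table is read first and which index is considered, but the access type on that index (ref, range, and so on) is still the optimizer's decision.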

On the other hand, for all JOINs, MySQL employs only NESTED LOOPS. That means that if you build the right JOIN order or choose the right indexes, MySQL will probably benefit from all IMPOSSIBLE WHEREs.

Nice commentary, but I think you are still skipping over my point: What I am/was looking for is a way to ask the query optimizer for query plans (plural) for different possible and impossible where clauses without having to back out values that trigger them. -- I could see a tool that just runs the optimizer and "forks" for every question asked and spits out every query plan that is generated (clearly some user pruning would be needed) so I can see what different plans I might end up with. – BCS May 4 '09 at 20:39

You mean, all possible plans for all possible values? There are 2^32 integers alone, to say nothing of VARCHARs. – Quassnoi May 4 '09 at 20:41

SELECT * FROM table1 WHERE id = 1 will result in an IMPOSSIBLE WHERE if ID is a PRIMARY KEY and there is no record with ID = 1. – Quassnoi May 4 '09 at 20:51

Great explanation! – Yosef May 4 '09 at 23:11

You are getting "Impossible WHERE noticed" because the value you specified is not in the column, not just because it is a constant. You could either 1) use a value that exists in the column or 2) just say col = col:

    EXPLAIN SELECT cols FROM tbl WHERE col = col
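The three cases can be compared side by side. This is only a sketch: the table tbl, its primary key col, and the payload column are hypothetical, and the exact wording in the Extra column differs between MySQL versions.

```sql
-- Hypothetical table; col is the primary key so constant lookups are possible.
CREATE TABLE tbl (
    col     INT NOT NULL PRIMARY KEY,
    payload VARCHAR(32)
);

INSERT INTO tbl (col, payload) VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- 1) Constant that exists: the row is read during optimization (type = const).
EXPLAIN SELECT col, payload FROM tbl WHERE col = 2;

-- 2) Constant that does not exist: the optimizer finds no row and reports an
--    impossible WHERE (or "no matching row in const table" in newer versions).
EXPLAIN SELECT col, payload FROM tbl WHERE col = 42;

-- 3) col = col: there is no constant to look up, so no early exit happens and
--    the plan shows how the table would be scanned.
EXPLAIN SELECT col, payload FROM tbl WHERE col = col;
```

Case 3 shows a plan for a predicate that nearly every row satisfies, which is not the same as the plan for a selective constant, so it is best read alongside cases 1 and 2 and not as a replacement for them.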

Neither of those solves my problem. I want to know what the query plan is where the optimizer doesn't know if the value is in the column. – BCS Nov 23 '08 at 3:01

Here's how it works: The optimizer determines if the select is possible by reading the const and system tables; if it is, then you get the query plan. My solution will give you the query plan because the optimizer won't stop early because it thinks the query is impossible. – Robert Gamble Nov 23 '08 at 3:28

Yes, it will give you a query plan and avoid the issue with the where clause never passing, but it will then assume that the where clause will always pass, and that isn't the case either. What I want is to know how things will perform for both cases. (see edit2) – BCS Nov 23 '08 at 21:19

By using indexes on the specific columns (or even on a combination of columns if you always query the given columns together). If you have indexes, the query planner will potentially use them.

Regarding "impossible" values: the query planner can conclude that a given value is not in the table from several sources:

- if there is an index on the particular column, it can observe that the particular value is larger or smaller than any value in the index (min/max values take constant time to extract from indexes)
- if you are passing in the wrong type (if you are asking for a numeric column to be equal to a text value)

PS. In general, creation of the query plan is not expensive, and it is better to re-create plans than to re-use them, since the conditions might have changed since the query plan was generated and a better query plan might exist.
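The indexing advice at the start of this answer can be checked with EXPLAIN. A minimal sketch, assuming a hypothetical orders table whose two filter columns are always queried together:

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE orders (
    id          INT NOT NULL PRIMARY KEY,
    customer_id INT NOT NULL,
    status      VARCHAR(16) NOT NULL
);

-- Composite index covering the columns that are filtered together.
CREATE INDEX ix_orders_customer_status ON orders (customer_id, status);

-- The key column of the EXPLAIN output shows whether the index was chosen.
EXPLAIN
SELECT id
FROM   orders
WHERE  customer_id = 7
  AND  status = 'open';
```

If the key column is NULL or names a different index, the planner judged another access path to be cheaper for the data it currently sees.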

It seems you are answering the same question as everyone else has, but not the one I am asking. I'll try, yet again, editing the question. – BCS Mar 5 '09 at 18:56.

NOTE: the original question is moot, but scan to the bottom for something relevant. I want to know what keys are being used, but whatever I pass to EXPLAIN, it is able to optimize the where clause to nothing ("Impossible WHERE noticed...") because I fed it a constant. Is there a way to tell MySQL to not do constant optimizations in EXPLAIN?

Am I missing something? Is there a better way to get the info I need? Edit: EXPLAIN seems to be giving me the query plan that will result from constant values.

As the query is part of a stored procedure (and IIRC query plans in sprocs are generated before they are called), this does me no good because the values are not constant. What I want is to find out what query plan the optimizer will generate when it doesn't know what the actual value will be. Am I missing something?

Edit2: Asking around elsewhere, it seems that MySQL always regenerates query plans unless you go out of your way to make it re-use them. Even in stored procedures. From this it would seem that my question is moot.

However, that doesn't make what I really wanted to know moot: How do you optimize a query that contains values that are constant within any specific query but where I, the programmer, don't know in advance what value will be used? -- For example, say my client side code is generating a query with a number in its where clause. Sometimes the number will result in an impossible where clause; other times it won't.

How can I use explain to examine how well optimized the query is? The best approach I'm seeing right off the bat would be to run EXPLAIN on it for the full matrix of exist/non-exist cases. Really that isn't a very good solution as it would be both hard and error prone to do by hand.
