MySQL subselect performance question?

I believe that the second is more efficient as it requires only one select, but to be sure, you should EXPLAIN each query and check the results.

EXPLAIN select tasks.* from tasks where some criteria and task.project_id not in (select id from project where project.is_template = 1);

EXPLAIN select tasks.* from tasks, project where some criteria and task.project_id = project.id and project.is_template <> 1;

Thanks for the EXPLAIN tip. Seems like adding an index on project.is_template helps a lot. – Marko Kocić Dec 4 '08 at 16:28
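The EXPLAIN check and the effect of an index can be tried out on a throwaway database. What follows is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL: SQLite spells the command EXPLAIN QUERY PLAN and prints a different format, "some criteria" is left out, the table is called tasks throughout, and the schema, the index name idx_project_is_template and the helper show_plan are made up for illustration. On a real MySQL server you would run EXPLAIN on the actual queries instead.

```python
import sqlite3

# Hypothetical schema; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, is_template INTEGER NOT NULL);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER NOT NULL);
""")

# The "not in" variant from the question, without "some criteria".
SUBSELECT = """
    SELECT tasks.* FROM tasks
    WHERE tasks.project_id NOT IN
          (SELECT id FROM project WHERE project.is_template = 1)
"""


def show_plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


plan_before = show_plan(SUBSELECT)
conn.execute("CREATE INDEX idx_project_is_template ON project (is_template)")
plan_after = show_plan(SUBSELECT)

print("before index:", plan_before)
print("after index: ", plan_after)
```

Comparing the two printed plans shows whether the planner picks up the new index for the subselect. The exact wording of the plan lines depends on the SQLite version, so treat it as a way to practise reading plans, not as a prediction of what MySQL will report.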

How much difference there is between the two could depend greatly on what "some criteria" is and what opportunities to use indexes it provides. But note that they are not equivalent in terms of results if there are tasks that don't have projects. The second is equivalent to this:

select tasks.* from tasks where some criteria and task.project_id in (select id from project where project.is_template <> 1)
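The difference in results can be reproduced with a few rows. This is a small sketch, assuming Python's built-in sqlite3 module in place of MySQL, a made-up schema and data set, "some criteria" left out, and a task whose project_id (99) points at a project row that does not exist. The helper task_ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, is_template INTEGER NOT NULL);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER);
    -- project 1 is a normal project, project 2 is a template
    INSERT INTO project VALUES (1, 0), (2, 1);
    -- task 3 references a project that does not exist
    INSERT INTO tasks VALUES (1, 1), (2, 2), (3, 99);
""")


def task_ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


not_in_ids = task_ids("""
    SELECT tasks.id FROM tasks
    WHERE tasks.project_id NOT IN
          (SELECT id FROM project WHERE project.is_template = 1)
    ORDER BY tasks.id
""")

join_ids = task_ids("""
    SELECT tasks.id FROM tasks, project
    WHERE tasks.project_id = project.id AND project.is_template <> 1
    ORDER BY tasks.id
""")

in_ids = task_ids("""
    SELECT tasks.id FROM tasks
    WHERE tasks.project_id IN
          (SELECT id FROM project WHERE project.is_template <> 1)
    ORDER BY tasks.id
""")

print("not in:", not_in_ids)  # [1, 3]
print("join:  ", join_ids)    # [1]
print("in:    ", in_ids)      # [1]
```

The "not in" form keeps the orphaned task 3 while the join and the "in" form drop it. If every task is guaranteed to reference an existing project, the three queries return the same rows.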

"Some criteria" can pretty much reduce the overall number of records returned. task.project_id is required, so these 2 queries are equivalent. I chose the "not in" query because then the subselect returns a much smaller number of records than it would if I chose "in". – Marko Kocić Dec 4 '08 at 16:27

I think the first may scale better. When you do a join, MySQL internally makes a sort of temporary table consisting of the two tables joined according to the join conditions specified. You aren't giving a join condition, so it'll create a temp table with all tasks listed against all projects. I'm fairly sure (but do check with the explain tool) that it does this prior to applying any where clauses.

Result: if there are 10 of each, it'll have 10 * 10 = 100 rows. You can see how this gets big as the numbers rise. It then applies the where clause to this temporary table. By contrast, the subquery selects only the relevant rows from each table.

But unless scaling is a concern, I don't think it really matters.
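The 10 * 10 arithmetic is easy to check on a toy data set. The sketch below assumes Python's built-in sqlite3 module instead of MySQL, a made-up schema with 10 projects and 10 tasks, and a hypothetical helper count. It only counts result rows; it does not show whether an engine builds the full product internally before filtering, which is what EXPLAIN on a real server is for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, is_template INTEGER NOT NULL);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER NOT NULL);
""")
# 10 projects (odd ids are templates) and 10 tasks, one task per project.
conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [(i, i % 2) for i in range(1, 11)])
conn.executemany("INSERT INTO tasks VALUES (?, ?)",
                 [(i, i) for i in range(1, 11)])


def count(sql):
    """Return the single number produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]


# Comma join with no condition at all: every task against every project.
cross_rows = count("SELECT COUNT(*) FROM tasks, project")

# Comma join with the matching condition placed in the WHERE clause.
matched_rows = count("""
    SELECT COUNT(*) FROM tasks, project
    WHERE tasks.project_id = project.id
""")

# The join from the question, without "some criteria".
filtered_rows = count("""
    SELECT COUNT(*) FROM tasks, project
    WHERE tasks.project_id = project.id AND project.is_template <> 1
""")

print(cross_rows, matched_rows, filtered_rows)  # 100 10 5
```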

No one agrees with me .... I want some discussion. – benlumley Dec 4 '08 at 16:20

Try a simple explain to prove yourself wrong. – ysth Dec 4 '08 at 20:14

Avoid sub queries like the plague in MySQL versions.

You are tarring with a wide brush there. Can you make your comment specific to this case? It's hard to see how lack of optimization could cause a substantial difference unless it runs the subquery repeatedly for each task row, and performance testing should show whether that's the case. – ysth Dec 4 '08 at 20:13

In every case I've seen, even when the query returns an unchanging set of rows for the IN operation, the subquery is run for each result of the main query. – Grant Limberg Dec 4 '08 at 20:34
