Different queries, same result (it seems), completely different performance… Why?

From t_entidades execute subplan 2 by examining a hash table built from scanning t_entidade_actividade.

An "explain analyze" would be able to tell you how often steps 1.1 and 1.2 were actually run for the query. If the scan in step 1.1 is being done for each row from step 1, then your query time will grow as O(n^2), where n is the number of rows in t_entidades, and the temp space used for each iteration of 1.1 will increase as the number of matches in that table increases.

Your query 2 is much better written, IMHO. Each of the two sets of IDs is produced in quite a different way, so put them in separate queries and use a UNION to merge them together at the end. It also cuts out the useless outer scan of t_entidades in query 1 that just passes through IDs from the where clause. (Not that it's relevant to PostgreSQL, but it also makes it clear that the two scans could be run in parallel and then merged, but never mind.)

t_entidade_actividade.actividade might need an index?

Spot on. From all queries in this page, after some testing, the union query is the one that works better when the search term shortens and returned rows increase... Also, thanks for explaining the explain. – andre matos Mar 3 at 10:38.

This is the first PostgreSQL execution plan I've seen, but it looks like the first plan is doing a table scan on t_entidades and then, for each row, it does all the stuff below, including two more table scans. In the second plan it still does the two inner scans but hash-aggregates the result. So assuming you have 100 rows in your table, the first plan does 201 table scans and the second does 2.

Go figure :-).
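The arithmetic behind the "201 versus 2" estimate can be written out as a small Python sketch. It only models this answer's reading of the two plans (one outer scan plus two inner scans per row, versus two inner scans run once); the function names are made up for illustration, and nothing here measures what PostgreSQL actually does.

```python
def scans_per_row_plan(n_rows: int, inner_scans_per_row: int = 2) -> int:
    """One outer table scan, plus the inner scans repeated for every outer row."""
    return 1 + n_rows * inner_scans_per_row


def scans_hashed_plan(inner_scans: int = 2) -> int:
    """The inner scans run once each; their result is aggregated and reused."""
    return inner_scans


# Assumed table sizes, chosen only to show how the two counts diverge.
for n in (100, 1_000, 10_000):
    print(n, scans_per_row_plan(n), scans_hashed_plan())
```

Under this model the first count grows linearly with the number of rows in t_entidades, and each of those scans itself reads a whole table, which is where the quadratic growth mentioned earlier would come from.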

The first query is so strange, it can only confuse the query planner. The first subquery should not be a subquery, and the second subquery has a LEFT JOIN that should be an INNER JOIN, but could also be written without a subquery at all. The second query also has a LEFT JOIN that is actually an INNER JOIN; check the WHERE condition.

    SELECT eid
    FROM t_entidades
    WHERE entidade_t LIKE '%cartography%'
    UNION
    SELECT entidade AS eid
    FROM t_entidade_actividade ea
    INNER JOIN t_actividades a ON a.aid = ea.actividade
    WHERE a.actividade LIKE '%cartography%'

And do you have indexes on the columns aid and actividade?

Indeed, the first sub-query is completely unnecessary. Your index suggestion for actividade was a good one. I forgot to create one; the difference, however, isn't that big. I believe I don't need to create indexes for primary keys (aid), is that right? – andre matos Mar 3 at 10:41.

You have joins that really are unnecessary. I've come to use the rule of thumb that if I'm not actually using a field as part of the returned set, I try to use EXISTS tests instead of joining. Something like:

    SELECT te.eid
    FROM t_entidades AS te
    WHERE te.entidade_t LIKE '%cartography%'
       OR EXISTS (
            SELECT 1
            FROM t_entidade_actividade AS ea
            WHERE ea.entidade = te.eid
              AND EXISTS (
                    SELECT 1
                    FROM t_actividades AS ta
                    WHERE ta.aid = ea.actividade
                      AND ta.actividade LIKE '%cartography%'
                  )
          )

Seems a nice way of avoiding JOINs, but it also makes queries a bit cumbersome... Thanks for your effort anyway. – andre matos Mar 3 at 10:40

It's really not so much about avoiding JOINs; it's more about the query analyzer and the final query plan. Although the query syntax is a bit cumbersome, your original complaint seems to have been performance based. The above query should run more efficiently and quickly. Each database query analyzer works a bit differently, but EXISTS tests are usually much quicker than JOINs when you are trying to eliminate results from the result set. – GunnerL3510 Mar 3 at 21:06

Thanks for the explanation of EXISTS, I didn't know they are faster than JOINs. About your query, it does perform better than my #1, but for smaller filters such as %car% the UNION (#2) is still faster. – andre matos Mar 4 at 9:42.
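Before comparing timings, it can help to confirm that the rewritten queries really return the same ids. The sketch below is one way to do that, written in Python with the built-in sqlite3 module as a stand-in for PostgreSQL. The column types and all sample rows are invented for illustration, and it assumes every t_entidade_actividade.entidade also exists in t_entidades (otherwise the UNION form could return ids that the EXISTS form cannot). It checks result equality only and says nothing about speed or about PostgreSQL's plans.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_entidades (eid INTEGER PRIMARY KEY, entidade_t TEXT);
    CREATE TABLE t_actividades (aid INTEGER PRIMARY KEY, actividade TEXT);
    CREATE TABLE t_entidade_actividade (entidade INTEGER, actividade INTEGER);

    -- Hypothetical sample data.
    INSERT INTO t_entidades VALUES
        (1, 'cartography institute'), (2, 'bakery'),
        (3, 'survey office'), (4, 'museum');
    INSERT INTO t_actividades VALUES
        (10, 'digital cartography'), (20, 'baking');
    INSERT INTO t_entidade_actividade VALUES (1, 10), (2, 20), (3, 10);
""")

UNION_QUERY = """
    SELECT eid FROM t_entidades WHERE entidade_t LIKE '%cartography%'
    UNION
    SELECT entidade AS eid
    FROM t_entidade_actividade ea
    INNER JOIN t_actividades a ON a.aid = ea.actividade
    WHERE a.actividade LIKE '%cartography%'
"""

EXISTS_QUERY = """
    SELECT te.eid
    FROM t_entidades AS te
    WHERE te.entidade_t LIKE '%cartography%'
       OR EXISTS (
            SELECT 1 FROM t_entidade_actividade AS ea
            WHERE ea.entidade = te.eid
              AND EXISTS (
                    SELECT 1 FROM t_actividades AS ta
                    WHERE ta.aid = ea.actividade
                      AND ta.actividade LIKE '%cartography%'))
"""


def ids(query: str) -> list:
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(query))


union_ids = ids(UNION_QUERY)
exists_ids = ids(EXISTS_QUERY)
print(union_ids, exists_ids, union_ids == exists_ids)
```

Entity 1 matches both by name and by activity, so this data also exercises the de-duplication that UNION performs.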
