I removed a node from the database and now my query runs faster. Why?
Hi folks. I'm trying a little scaling experiment where I'm removing a node to see how a test query's performance changes. When I removed a node from the database (bringing the DB Size (#Nodes) from 4 to 3, the query actually runs /faster/ than with 4: Instead of taking ~8 minutes it now runs in ~5. The query does lots of joins. I'd really appreciate someone explaining how this is possible. I've posted the query plus the two different PROFILE outputs at https://gist.github.com/anonymous/9487818 . I've also included the query below. Thank you.
SELECT relVarTable0.id AS id, relVarTable1.val, relVarTable2.val
FROM (SELECT id FROM rates) relVarTable0
LEFT JOIN
(SELECT rates1.id AS id, AVG(rates4.rating) AS val
FROM rates rates1, movie movie1, rates rates2, ml_user ml_user1, rates rates3, movie movie2, rates rates4
WHERE movie1.id = rates1.movie_id AND movie1.id = rates2.movie_id AND ml_user1.id = rates2.ml_user_id AND ml_user1.id = rates3.ml_user_id AND movie2.id = rates3.movie_id AND movie2.id = rates4.movie_id AND movie1.id <> movie2.id AND rates1.id <> rates2.id AND rates2.id <> rates3.id AND rates3.id <> rates4.id AND rates4.rating IS NOT NULL
GROUP BY rates1.id) relVarTable1
ON relVarTable0.id = relVarTable1.id
LEFT JOIN
(SELECT rates1.id AS id, rates1.rating AS val
FROM rates rates1
WHERE rates1.rating IS NOT NULL ) relVarTable2
ON relVarTable0.id = relVarTable2.id;
0
Comments
I don't immediately know the answer, but, in the addition to PROFILE, could you post the output of EXPLAIN for this query?
Also, have you run the database designer against these tables and this or a similar query? And "analyze_statistics('')", to help the optimizer pick the best JOIN plan?
One possible cause of this issue is that these tables have a very sub-optimal projection design, to the point where we give up and decide that the best way to proceed is to just pull all of the data onto the initiator node and run the query as a single-machine query. (If you have fewer machines, then more data is already local on any given machine, so, more is on the initiator.)
This is uncommon, and your query doesn't immediately look to me like one that would produce such a plan (though it's complex enough that I can't tell for sure at a quick glance). We can tell for sure by asking Vertica what it's doing, ie., EXPLAIN. The database designer will fix this sort of issue automatically.
Adam
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN
SELECT relVarTable0.id AS id, relVarTable1.val, relVarTable2.val
FROM (SELECT id FROM rates) relVarTable0
LEFT JOIN
(SELECT rates1.id AS id, AVG(rates4.rating) AS val
FROM rates rates1, movie movie1, rates rates2, ml_user ml_user1, rates rates3, movie movie2, rates rates4
WHERE movie1.id = rates1.movie_id AND movie1.id = rates2.movie_id AND ml_user1.id = rates2.ml_user_id AND ml_user1.id = rates3.ml_user_id AND movie2.id = rates3.movie_id AND movie2.id = rates4.movie_id AND movie1.id <> movie2.id AND rates1.id <> rates2.id AND rates2.id <> rates3.id AND rates3.id <> rates4.id AND rates4.rating IS NOT NULL
GROUP BY rates1.id) relVarTable1
ON relVarTable0.id = relVarTable1.id
LEFT JOIN
(SELECT rates1.id AS id, rates1.rating AS val
FROM rates rates1
WHERE rates1.rating IS NOT NULL ) relVarTable2
ON relVarTable0.id = relVarTable2.id;
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [LeftOuter] [Cost: 6K, Rows: 76K (NO STATISTICS)] (PATH ID: 1) Inner (RESEGMENT)
| Join Cond: (relVarTable0.id = relVarTable2.id)
| Execute on: All Nodes
| +-- Outer -> JOIN MERGEJOIN(inputs presorted) [LeftOuter] [Cost: 6K, Rows: 76K (NO STATISTICS)] (PATH ID: 2) Outer (RESEGMENT)
| | Join Cond: (relVarTable0.id = relVarTable1.id)
| | Execute on: All Nodes
| | +-- Outer -> SELECT [Cost: 26, Rows: 76K (NO STATISTICS)] (PATH ID: 3)
| | | Execute on: All Nodes
| | | +---> STORAGE ACCESS for rates [Cost: 26, Rows: 76K (NO STATISTICS)] (PATH ID: 4)
| | | | Projection: movielens_test.rates_b0
| | | | Materialize: rates.id
| | | | Execute on: All Nodes
| | +-- Inner -> SELECT [Cost: 5K, Rows: 10K (NO STATISTICS)] (PATH ID: 5)
| | | Execute on: All Nodes
| | | +---> GROUPBY HASH (SORT OUTPUT) (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 5K, Rows: 10K (NO STATISTICS)] (PATH ID: 6)
| | | | Aggregates: sum_float(<SVAR>), count(<SVAR>)
| | | | Group By: rates1.id
| | | | Execute on: All Nodes
| | | | +---> JOIN HASH [Cost: 4K, Rows: 76K (NO STATISTICS)] (PATH ID: 7) Outer (BROADCAST)(LOCAL ROUND ROBIN)
| | | | | Join Cond: (movie2.id = rates4.movie_id)
| | | | | Join Filter: (rates3.id <> rates4.id)
| | | | | Execute on: All Nodes
| | | | | +-- Outer -> STORAGE ACCESS for rates4 [Cost: 80, Rows: 76K (NO STATISTICS)] (PUSHED GROUPING) (PATH ID: 8)
| | | | | | Projection: movielens_test.rates_b0
| | | | | | Materialize: rates4.rating, rates4.id, rates4.movie_id
| | | | | | Filter: (rates4.rating IS NOT NULL)
| | | | | | Execute on: All Nodes
| | | | | +-- Inner -> JOIN HASH [Cost: 2K, Rows: 76K (NO STATISTICS)] (PATH ID: 9) Outer (LOCAL ROUND ROBIN) Inner (BROADCAST)
| | | | | | Join Cond: (movie2.id = rates3.movie_id)
| | | | | | Join Filter: (movie1.id <> movie2.id)
| | | | | | Execute on: All Nodes
| | | | | | +-- Outer -> JOIN HASH [Cost: 2K, Rows: 76K (NO STATISTICS)] (PATH ID: 10) Inner (RESEGMENT)
| | | | | | | Join Cond: (ml_user1.id = rates3.ml_user_id)
| | | | | | | Join Filter: (rates2.id <> rates3.id)
| | | | | | | Execute on: All Nodes
| | | | | | | Runtime Filter: (SIP1(HashJoin): rates3.movie_id)
| | | | | | | +-- Outer -> JOIN HASH [Cost: 2K, Rows: 76K (NO STATISTICS)] (PATH ID: 11) Outer (RESEGMENT)(LOCAL ROUND ROBIN) Inner (RESEGMENT)
| | | | | | | | Join Cond: (ml_user1.id = rates2.ml_user_id)
| | | | | | | | Execute on: All Nodes
| | | | | | | | Runtime Filter: (SIP2(HashJoin): ml_user1.id)
| | | | | | | | +-- Outer -> JOIN HASH [Cost: 949, Rows: 76K (NO STATISTICS)] (PATH ID: 12) Outer (RESEGMENT)(LOCAL ROUND ROBIN)
| | | | | | | | | Join Cond: (movie1.id = rates2.movie_id)
| | | | | | | | | Join Filter: (rates1.id <> rates2.id)
| | | | | | | | | Execute on: All Nodes
| | | | | | | | | +-- Outer -> STORAGE ACCESS for rates2 [Cost: 78, Rows: 76K (NO STATISTICS)] (PATH ID: 13)
| | | | | | | | | | Projection: movielens_test.rates_b0
| | | | | | | | | | Materialize: rates2.id, rates2.ml_user_id, rates2.movie_id
| | | | | | | | | | Execute on: All Nodes
| | | | | | | | | +-- Inner -> JOIN HASH [Cost: 359, Rows: 76K (NO STATISTICS)] (PATH ID: 14) Outer (RESEGMENT)(LOCAL ROUND ROBIN) Inner (RESEGMENT)
| | | | | | | | | | Join Cond: (movie1.id = rates1.movie_id)
| | | | | | | | | | Execute on: All Nodes
| | | | | | | | | | +-- Outer -> STORAGE ACCESS for rates1 [Cost: 52, Rows: 76K (NO STATISTICS)] (PATH ID: 15)
| | | | | | | | | | | Projection: movielens_test.rates_b0
| | | | | | | | | | | Materialize: rates1.id, rates1.movie_id
| | | | | | | | | | | Execute on: All Nodes
| | | | | | | | | | +-- Inner -> STORAGE ACCESS for movie1 [Cost: 9, Rows: 2K (NO STATISTICS)] (PATH ID: 16)
| | | | | | | | | | | Projection: movielens_test.movie_b0
| | | | | | | | | | | Materialize: movie1.id
| | | | | | | | | | | Execute on: All Nodes
| | | | | | | | +-- Inner -> STORAGE ACCESS for ml_user1 [Cost: 9, Rows: 6K (NO STATISTICS)] (PATH ID: 17)
| | | | | | | | | Projection: movielens_test.ml_user_b0
| | | | | | | | | Materialize: ml_user1.id
| | | | | | | | | Execute on: All Nodes
| | | | | | | +-- Inner -> STORAGE ACCESS for rates3 [Cost: 78, Rows: 76K (NO STATISTICS)] (PATH ID: 18)
| | | | | | | | Projection: movielens_test.rates_b0
| | | | | | | | Materialize: rates3.id, rates3.ml_user_id, rates3.movie_id
| | | | | | | | Execute on: All Nodes
| | | | | | +-- Inner -> STORAGE ACCESS for movie2 [Cost: 9, Rows: 2K (NO STATISTICS)] (PATH ID: 19)
| | | | | | | Projection: movielens_test.movie_b0
| | | | | | | Materialize: movie2.id
| | | | | | | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 54, Rows: 76K (NO STATISTICS)] (PATH ID: 20)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for rates1 [Cost: 54, Rows: 76K (NO STATISTICS)] (PATH ID: 21)
| | | Projection: movielens_test.rates_b0
| | | Materialize: rates1.rating, rates1.id
| | | Filter: (rates1.rating IS NOT NULL)
| | | Execute on: All Nodes
------------------------------