How to avoid flattening of subqueries in WHERE clauses?
rikusg
Hi there,
I have a question regarding the way that the query planner flattens subqueries in WHERE clauses to one or more JOINs.
Here is an example table:
CREATE TABLE geo(
uuid int NOT NULL,
type int NOT NULL,
cid int
)
ORDER BY geo.type, geo.cid
SEGMENTED BY hash(geo.uuid) ALL NODES KSAFE 1;
dbadmin=> SELECT type, count(*) FROM geo GROUP BY type;
type | count
------+------------
0 | 1500000000
1 | 1073741824
-1 | 426258176
(3 rows)
Let's filter using a highly selective subquery in a WHERE clause with an OR:
SELECT count(*) FROM geo
WHERE type=-1
OR type=1 AND cid IN (SELECT 2223); --Illustrative highly selective subquery
count
-----------
426258177
(1 row)
Time: First fetch (1 row): 6954.000 ms. All rows formatted: 6990.000 ms
This is surprisingly slow. Let's inline the subquery's result:
SELECT count(*) FROM geo
WHERE type=-1
OR type=1 AND cid IN (2223); --Inline the subquery's results
count
-----------
426258177
(1 row)
Time: First fetch (1 row): 329.000 ms. All rows formatted: 367.000 ms
Much better!
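In practice you can automate this inlining by resolving the subquery client-side before the main query runs. A minimal sketch, assuming vsql (which inherits psql-style client variables; the variable name `cid_filter` is made up for illustration):

```sql
-- Run the selective subquery first (here its known result is assigned
-- directly), then let vsql substitute it as a literal so the planner
-- only ever sees an inline IN-list:
\set cid_filter 2223
SELECT count(*) FROM geo
WHERE type=-1
   OR type=1 AND cid IN (:cid_filter);
```

The same two-step pattern works from any client driver: execute the selective subquery on its own, then build the main statement with the returned values inlined as literals.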
How about adding another OR with an additional subquery?
SELECT count(*) FROM geo
WHERE type=-1
OR type=1 AND cid IN (SELECT 2223) --Illustrative highly selective subquery
OR type=0 AND cid IN (SELECT 1038); --Illustrative highly selective subquery
count
-----------
426258178
(1 row)
Time: First fetch (1 row): 15351.000 ms. All rows formatted: 15388.000 ms
Now it is even slower. So what is going on here? Let's look at the query plan:
Access Path:
+-GROUPBY NOTHING [Cost: 3M, Rows: 1] (PATH ID: 1)
| Aggregates: count(*)
| Execute on: All Nodes
| +---> JOIN HASH [Semi] [Cost: 3M, Rows: 750M] (PATH ID: 2) Inner (BROADCAST)
| | Join Cond: (geo.cid = VAL(3))
| | Materialize at Input: geo.cid
| | Materialize at Output: geo.type
| | Filter: ((geo.type = (-1)) OR ((geo.type = 1) AND VAL(2)) OR ((geo.type = 0) AND VAL(3)))
| | Execute on: All Nodes
| | +-- Outer -> JOIN HASH [Semi] [Cost: 2M, Rows: 2B] (PATH ID: 3) Inner (BROADCAST)
| | | Join Cond: (geo.cid = VAL(2))
| | | Execute on: All Nodes
| | | +-- Outer -> STORAGE ACCESS for geo [Cost: 1M, Rows: 3B] (PATH ID: 4)
| | | | Projection: public.geop_b0
| | | | Materialize: geo.cid
| | | | Execute on: All Nodes
| | | +-- Inner -> SELECT [Cost: 10, Rows: 1] (PATH ID: 5)
| | | | Execute on: Query Initiator
| | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 6)
| | | | | Projection: v_catalog.dual_p
| | | | | Execute on: Query Initiator
| | +-- Inner -> SELECT [Cost: 10, Rows: 1] (PATH ID: 7)
| | | Execute on: Query Initiator
| | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 8)
| | | | Projection: v_catalog.dual_p
| | | | Execute on: Query Initiator
It looks like the planner flattened each IN subquery into a hash semi join, with the subquery's result broadcast to all nodes, so the predicate is evaluated through a chain of joins over the full table rather than as a simple storage-level filter.
Let's look at the query plan for an inline case:
Access Path:
+-GROUPBY NOTHING [Cost: 3M, Rows: 1] (PATH ID: 1)
| Aggregates: count(*)
| Execute on: All Nodes
| +---> STORAGE ACCESS for geo [Cost: 3M, Rows: 1B] (PATH ID: 2)
| | Projection: public.geop_b0
| | Materialize: geo.type, geo.cid
| | Filter: ((geo.type = (-1)) OR ((geo.type = 1) AND (geo.cid = 2223)) OR ((geo.type = 0) AND (geo.cid = 1038)))
| | Execute on: All Nodes
Much better!
So the question is: how can this flattening behavior be avoided? It is clearly not a good idea when the subqueries are highly selective.
Any help much appreciated!
Comments
Great question; I just ran into the same issue. Getting an answer would be great.
Do you have "SELECT 2223" as a subquery in your real application, or is it actually a query that can return a set of values? This specific example can be optimized, but I guess your real use case could be different.
I can open an enhancement request for this specific example.
I think a common use case would be keeping offsets in a table for processing rows. These would often be timestamps or integers. You would then want to query the offset table to build the next WHERE clause.
Possible feature enhancements:
So we managed to get past this with a really ugly hack: our Java code creates a scalar function that the query then uses. This has a huge impact on performance; some of our ETL jobs went from 3 hours down to 20 minutes after this change.
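For reference, such a workaround can be sketched in SQL alone. This is an illustration, not the poster's actual code: the function name `current_cid` is invented, and it assumes Vertica expands user-defined SQL functions inline at plan time, so the predicate becomes a constant comparison instead of a subquery semi join:

```sql
-- Hypothetical helper: the ETL job recreates it with the current
-- offset value before each run, so queries referencing it get the
-- value expanded as a literal rather than planned as a subquery.
CREATE OR REPLACE FUNCTION current_cid() RETURN int
AS BEGIN
    RETURN 2223;
END;

SELECT count(*) FROM geo
WHERE type=-1
   OR type=1 AND cid = current_cid();
```

The trade-off is that the function must be redefined whenever the offset changes, which is why it feels like a hack, but it keeps the fast filter-only plan.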