Options

Joining 3 tables efficiently

I am trying to join 3 tables in the same query. 1 larger fact table and 2 smaller dimension tables.
However, what I find is that the query uses up all the memory, as the first join between 1 dimension table (mlog.tmp_stg_xmode_dids_lookup_complete_b0) and the fact (dev.stg_xmode_log_b0) table will become the inner part of the next hash join, which then doesn't fit in the memory.
Is there a way to hint an optimizer that the result of the first join should be the OUTER part of the 2nd join, and user the 3rd table as the inner (mlog.d_user_agent_hash_user_agent40) ?

Thanks for any suggestions. Note that there are no statistics on the tables as they are populated as part of the transaction and wouldn't want to commit the transaction yet (ANALYZE_STATISTICS commits a transaction)

Access Path:
+-JOIN HASH [RightOuter] [Cost: 4M, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (e.hash_user_agent40 = a.hash_user_agent40) AND (e.user_agent = a.user_agent)
| Materialize at Input: e.hash_did8, e.did, e.country, e.region, e.city, e.device_type, e.ts, e.ip, e.lat, e.lon, e.haccuracy, e.region2, e.zip_code, e.user_agent, e.hash_user_agent40, e.publisher_id
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for a [Cost: 355K, Rows: 4M] (PATH ID: 2)
| | Projection: mlog.d_user_agent_hash_user_agent40
| | Materialize: a.hash_user_agent40, a.user_agent, a.ua_id
| | Execute on: All Nodes
| | Runtime Filters: (SIP1(HashJoin): a.hash_user_agent40), (SIP2(HashJoin): a.user_agent), (SIP3(HashJoin): a.hash_user_agent40, a.user_agent)
| +-- Inner -> JOIN MERGEJOIN(inputs presorted) [Cost: 4M, Rows: 1 (NO STATISTICS)] (PATH ID: 3)
| | Join Cond: (e.hash_did8 = d.hash_did8) AND (e.did = d.did)
| | Execute on: All Nodes
| | +-- Outer -> STORAGE ACCESS for e [Cost: 3M, Rows: 1B] (PATH ID: 4)
| | | Projection: dev.stg_xmode_log_b0
| | | Materialize: e.hash_did8, e.did
| | | Filter: ((e.did IS NOT NULL) AND (e.did <> '00000000-0000-0000-0000-000000000000'))
| | | Filter: ((e.ts IS NOT NULL) AND (e.ts <= '2018-10-31 00:32:29.36579'::timestamp))
| | | Filter: ((e.lat >= (-90)) AND (e.lat <= 90))
| | | Filter: ((e.lon >= (-180)) AND (e.lon <= 180))
| | | Execute on: All Nodes
| | | Runtime Filters: (SIP4(MergeJoin): e.hash_did8), (SIP5(MergeJoin): e.did), (SIP6(MergeJoin): e.hash_did8, e.did)
| | +-- Inner -> STORAGE ACCESS for d [Cost: 1, Rows: 1 (NO STATISTICS)] (PATH ID: 5)
| | | Projection: mlog.tmp_stg_xmode_dids_lookup_complete_b0
| | | Materialize: d.did, d.hash_did8, d.device_id
| | | Filter: ((d.did IS NOT NULL) AND (d.did <> '00000000-0000-0000-0000-000000000000'))
| | | Execute on: All Nodes

Comments

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file