Is the outer join done in multiple thread?
I am using the following SQL to check
Select operator_name,count(*) from v_monitor.execution_engine_profiles where transaction_id=XXX
and statement_id=1
and counter_name ='clock time (us)' group by operator_name
When use inner join (hash), the result is:
GroupByPipe | 240 |
Root | 1 |
NewEENode | 6 |
ExprEval | 240 |
GroupByHash | 240 |
ParallelUnion | 130 |
Scan | 245 |
Join | 240 |
But if use left join, the result is:
Root | 1 |
NewEENode | 6 |
ExprEval | 5 |
GroupByPipe | 10 |
Scan | 245 |
Join | 5 |
Filter | 5 |
Sort | 5 |
Consider I have a cluster of 5 nodes, so looks like vertica can only sue one thread per node for outer join but use multiple threads each node for inner join.
I have SQL wich has many left join and the performane is very bad comparing with sql server, sql server could do outer join in multiple thread mode, why vertica can't use multiple threads for outer join?
0
Comments
Can you add the query ( or a simplified version of it ) ? Also, can you check explain local verbose SQLQUERY ? It shows worker thread information for the query. Try it for both inner and outer join query. Publish the results if you can. It would be useful to know if the operation is singlethreaded or not and whether it could be made multithreaded. My expectation is that left joins are no different than inner joins and could be run multithreaded with multiple threads splitting the left side of the left outer join.
The query is:
select a.store_key,count(*) from PNG_WALGRN_JOHN.SPD_FACT_PIVOT a left join PNG_WALGRN_JOHN.SPD_FACT_PIVOT b on a.item_key=b.store_key
where a.period_key=20150713
group by a.store_key;
That's a left join, it's run in parrallel of join on each ndoe, but if replace left join with inner jion, then it's run in single thread in in node.
Attached is the local verbose plan.
I have provided the explain file, could anyone help? Why inner join could run in parallel inside a node, but left join can't