How to force optimizer to broadcast table and use as inner source in hash join
Hi,
I have user SQL that rather stubbornly stick to same execution plan, despite my attempts to change it (using only documented hints).
It is effectively
select XXXXXX
from A join B join C
XXXXX;
First, A and B resegmented and joined. It is possible to change what will be inner and outer source in hash join for this stage, using hint.
Then, output of A and B join resegmented with C, and joined. C is inner source, A and B is outer. And, it is not possible to change what would be inner and what would be outer source (despite frantic hints manipulations).
A is rather small, B and C are huge. Result of join A and B is small.
That is very poor execution plan, optimizer is not smart enough (Vertica optimizer developers are smart but way too few of them).
What I want:
1) Broadcast A
2) Join with B locally, and broadcasted A should be inner source in hash join
3) Broadcast result to join
4) Use broadcasted join result as inner source in hash join with C.
Prime point, should be no resegmentation of B and C. As those are humongous.
What I cannot accomplish:
1) I cannot force result of join to be broadcasted.
2) I cannot force result of previous join to be used as inner source in hash join.
3) I cannot avoid resegmentation of B and C.
Can you give some hints how to do it?
Thanks in advance.
Answers
Hello Sergey_Cherepan_1 ,
I think something like this could probably help:
select /+syn_join/ xxx
from C join /+distrib(A,B), jtype(H)/ (B join /+distrib(L,B), jtype(H)/ A on xxx) on xxx
You could find more information about those hints here https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AdministratorsGuide/QueryManagement/DirectedQueries/AnnotateQueries.htm?zoom_highlight=hints
I hope this helps,
Thao
Hi Sergey_Cherepan_1,
I think something like this could probably give the plan you want: (the asterisks needed for the hint below got dropped and I am not sure how to keep it)
Select / +syn_join / xxx
From C join /+distrib(A,B), jtype(H)/ (B join /+distrib(L,B), jtype(H)/ A on xxx) on xxx;
Note that we need the syn_join (or synctatic_join) hint at the beginning of the query in order for the remaining join hints to work. More information about those hints can be found here: https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AdministratorsGuide/QueryManagement/DirectedQueries/AnnotateQueries.htm?zoom_highlight=hints
I hope this helps.