The explain for upper and non-Upperb are different and execution times vary widely
hi guys,
In vertica 9.2, Which I use now, I encounter the following problem. When UPPERB is used, the execution result is completed in 3s, but I don't use UPPERB, which needs 50s to complete. Here are SQL and EXPLAIN
use UPPERB (Take 3 s)select sum( ext_reseller_price) sourceExtRsPrice ,sum(ext_eu_price) sourceExtCustomerPrice ,sum(ext_msrp) sourceExtMsrp from dw_us.dws_stellr_consumption_azure_usage_1d consume where date_flag between to_date(to_char(current_date, 'YYYY-MM-01'), 'YYYY-MM-DD') and current_date and UPPERB(consume.resource_uri) in ( select UPPERB(resource_uri) from dm_us.dm_stellr_azure_filter_view fv where 1 = 1 and reseller_no=123456 group by UPPERB(resource_uri) )
+-GROUPBY NOTHING [Cost: 172M, Rows: 1] (PATH ID: 1)
Aggregates: sum(consume.ext_reseller_price), sum(consume.ext_eu_price), sum(consume.ext_msrp) Execute on: All Nodes
+---> JOIN HASH [Semi] [Cost: 171M, Rows: 590K] (PATH ID: 2) Inner (BROADCAST)
Join Cond: (upperb(consume.resource_uri) = VAL(2)) Materialize at Output: consume.ext_msrp, consume.ext_reseller_price, consume.ext_eu_price Execute on: All Nodes
+-- Outer -> STORAGE ACCESS for consume [Cost: 13M, Rows: 1M] (PATH ID: 3)
Projection: dw_us.dws_stellr_consumption_azure_usage_1d_DBD_1_seg_test2_b0 Materialize: consume.resource_uri Filter: ((consume.date_flag >= '2020-06-01'::date) AND (consume.date_flag <= '2020-06-26'::date)) Execute on: All Nodes Runtime Filter: (SIP1(HashJoin): upperb(consume.resource_uri))
+---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 41M, Rows: 700K] (PATH ID: 5)
Group By: upperb(fv.resource_uri) Execute on: All Nodes
+---> STORAGE ACCESS for fv [Cost: 2M, Rows: 700K] (PATH ID: 6)
Projection: dm_us.dm_stellr_azure_filter_view_DBD_2_seg_test2_b0 Materialize: fv.resource_uri Filter: (fv.reseller_no = 123456) Execute on: All Nodes
no use UPPERB(Take 50s)
select sum( ext_reseller_price) sourceExtRsPrice
,sum(ext_eu_price) sourceExtCustomerPrice
,sum(ext_msrp) sourceExtMsrp
from dw_us.dws_stellr_consumption_azure_usage_1d consume
where date_flag between to_date(to_char(current_date, 'YYYY-MM-01'), 'YYYY-MM-DD') and current_date
and consume.resource_uri in (
select resource_uri
from dm_us.dm_stellr_azure_filter_view fv
where 1 = 1 and reseller_no=123456
group by resource_uri
)
+-GROUPBY NOTHING [Cost: 144M, Rows: 1] (PATH ID: 1)
Aggregates: sum(), sum(), sum() Execute on: All Nodes
+---> JOIN HASH [Semi] [Cost: 143M, Rows: 1M] (PATH ID: 2) Outer (RESEGMENT)(LOCAL ROUND ROBIN)
Join Cond: (consume.resource_uri = VAL(2)) Execute on: All Nodes
+-- Outer -> STORAGE ACCESS for consume [Cost: 13M, Rows: 1M] (PUSHED GROUPING) (PATH ID: 3)
Projection: dw_us.dws_stellr_consumption_azure_usage_1d_DBD_1_seg_test2_b0 Materialize: consume.ext_eu_price, consume.resource_uri, consume.ext_msrp, consume.ext_reseller_price Filter: ((consume.date_flag >= '2020-06-01'::date) AND (consume.date_flag <= '2020-06-26'::date)) Execute on: All Nodes
+---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 41M, Rows: 700K] (PATH ID: 5)
Group By: fv.resource_uri Execute on: All Nodes
+---> STORAGE ACCESS for fv [Cost: 2M, Rows: 700K] (PATH ID: 6)
Projection: dm_us.dm_stellr_azure_filter_view_DBD_2_seg_test2_b0 Materialize: fv.resource_uri Filter: (fv.reseller_no = 123456) Execute on: All Nodes
Answers
I think the following differences affected performance.
For 'use UPPERB', Runtime Filter appeared and it could reduce the number of rows by this filter when accessing the table. It might be able to reduce the disk I/O and could reduce the data before processing the join operator. The optimizer chose RESEGMENT for the join operator in case of no use UPPERB, but chose BROADCAST in case of use UPPERB. It didn't choose RESEGMENT because the function (UPPERB) was used as the join key. But, on the other hand, Sidewise Information Passing (SIP) is available when choosing BROADCAST, not available when choosing RESEGMENT.
As a result, in your case, BROADCAST + SIP was faster than RESEGMENT.