Options

Help needed to improve RANGE JOIN Operation

 EXPLAIN
SELECT
str.Parameter_name
,trm.Week_Number
,SUM(CASE WHEN trm.Period_Ind = 'CURR' THEN COALESCE(str.Sales_Usd_Val ,0) END) as CY_Sales_Usd_Val
,SUM(CASE WHEN trm.Period_Ind = 'PREV' THEN COALESCE(str.Sales_Usd_Val ,0) END) as PY_Sales_Usd_Val
,SUM(CASE WHEN trm.Period_Ind = 'PPREV' THEN COALESCE(str.Sales_Usd_Val ,0) END) as PPY_Sales_Usd_Val
,SUM(CASE WHEN trm.Period_Ind = 'CURR' THEN COALESCE(str.Sales_Local_Val,0) END) as CY_Sales_Local_Val
,SUM(CASE WHEN trm.Period_Ind = 'PREV' THEN COALESCE(str.Sales_Local_Val,0) END) as PY_Sales_Local_Val
,SUM(CASE WHEN trm.Period_Ind = 'PPREV' THEN COALESCE(str.Sales_Local_Val,0) END) as PPY_Sales_Local_Val
FROM
BILLINGS str
INNER JOIN
Data_Range_dim trm
ON str.Report_Date BETWEEN trm.Start_Dt and trm.End_Dt
and str.Site_cd = trm.Site_cd
WHERE
trm.Report_Date = '2015-11-20'
and trm.Timeframe = 'QTD'
and str.Metric_type = 'Sales'
and trm.Status_Indicator = 'T'
and str.Region_cd IN ( 'WW' ) /* Dynamic Geo Column selection */
and str.Channel_Cd IN ( 'ALL' ) /* Dynamic Segment Column selection */
and str.Prod_Type = 'ALL'

and ((str.Parameter_name IN ('TOTAL','STORE','PPU','Traffic','Transaction') )
OR
(str.Parameter_name NOT IN ('TOTAL','STORE','PPU','Traffic','Transaction','Forecast') AND trm.Week_Number = 'TOTAL')
)
GROUP BY
1,2;

Access Path:
+-GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 271K, Rows: 1K] (PATH ID: 1)
| Aggregates: sum(CASE WHEN (trm.Period_Ind = 'CURR') THEN coalesce(str.Sales_Usd_Val, 0) ELSE NULL::numeric(1,0) END), sum(CASE WHEN (trm.Period_Ind = 'PREV') THEN coalesce(
str.Sales_Usd_Val, 0) ELSE NULL::numeric(1,0) END), sum(CASE WHEN (trm.Period_Ind = 'PPREV') THEN coalesce(str.Sales_Usd_Val, 0) ELSE NULL::numeric(1,0) END), sum(CASE WHEN (tr
m.Period_Ind = 'CURR') THEN coalesce(str.Sales_Local_Val, 0) ELSE NULL::numeric(1,0) END), sum(CASE WHEN (trm.Period_Ind = 'PREV') THEN coalesce(str.Sales_Local_Val, 0) ELSE NU
LL::numeric(1,0) END), sum(CASE WHEN (trm.Period_Ind = 'PPREV') THEN coalesce(str.Sales_Local_Val, 0) ELSE NULL::numeric(1,0) END)
| Group By: str.Parameter_name, trm.Week_Number
| Execute on: All Nodes
| +---> JOIN (RANGE JOIN) [Cost: 54K, Rows: 24M] (PATH ID: 2) Inner (BROADCAST)
| | Join Cond: (str.Site_cd = (trm.Site_cd))
| | Join Filter: (str.Report_Date >= trm.Start_Dt) AND (str.Report_Date <= trm.End_Dt) AND ((str.Parameter_name = ANY (ARRAY['TOTAL', 'STORE', 'PPU', 'Traffic', 'Transaction']))
OR ((str.Parameter_name <> ALL (ARRAY['TOTAL', 'STORE', 'PPU', 'Traffic', 'Transaction', 'Forecast'])) AND (trm.Week_Number = 'TOTAL')))
| | Materialize at Output: str.Sales_Usd_Val, str.Sales_Local_Val
| | Execute on: All Nodes
| | +-- Outer -> STORAGE ACCESS for str [Cost: 17K, Rows: 24M] (PATH ID: 3)
| | | Projection: BILLINGS_DBD_1_seg_DBD_b0_b0
| | | Materialize: str.Report_Date, str.Site_cd, str.Parameter_name
| | | Filter: (str.Metric_type = 'Sales')
| | | Filter: (str.Prod_Type = 'ALL')
| | | Filter: (str.Region_cd = 'WW')
| | | Filter: (str.Channel_Cd = 'ALL')
| | | Execute on: All Nodes
| | | Runtime Filter: (SIP1(HashJoin): str.Site_cd)
| | +-- Inner -> STORAGE ACCESS for trm [Cost: 369, Rows: 30K] (PATH ID: 4)
| | | Projection: Data_Range_dim_DBD_2_seg_DBD_b0_b0
| | | Materialize: trm.Period_Ind, trm.Week_Number, trm.Site_cd, trm.Start_Dt, trm.End_Dt
| | | Filter: (trm.Report_Date = '2015-11-20'::date)
| | | Filter: (trm.Status_Indicator = 'T')
| | | Filter: (trm.Timeframe = 'QTD')
| | | Execute on: All Nodes

From query_profile table, it is understood that Path 1 (Global Segment) & Path 2 (Join) are causing performance issue. This query is taking around 800 ms which needs to be further optimized as it's being used as a sub query in another query.

Note: I deployed incremental design (DBD suggested) but still no luck.

 

Looking forward for your suggestions.

Comments

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file