Customized Projections
I have run DBD and deployed it's recommendations . However, i'm working on creating customized projections to see if my queries could get extra performance gain. I find customized projections to be ultrafast provided i access them directly (not referring to base table)but i don't know how to force optimizer to access customized projections created by me when i directly refer to base table. Any pointers on tips to make optimizer use customized projections? thanks for your help in advance.
0
Comments
Below is an undocumented method , do not use it in production without support guideline :
Thanks .SELECT set_optimizer_directives('AvoidUsingProjections=proj1,proj2'); < disable projection
SELECT set_optimizer_directives('AvoidUsingProjections='); <-Enable it back
like
SELECT set_optimizer_directives('UseOnlyProjections=<your projection name>');
A request, if I may: We are constantly trying to improve our DBD to create optimal projections for your queries, and our Optimizer to always choose the right projection. Usually both do. If you are consistently able to do much better by hand, we would really appreciate it if you're willing to post an example so that we can figure out what our implementation could be doing better.
Adam
@Adam: I'm sure DBD is the best bet to create optimal projections . However, as mentioned in documentation, we can create optimized design by modifying or creating the design created by DBD. As i review DBD projections, i thought i should try applying some of the basic tuning principles such as GROUP HASH --> GROUP PIPELINED , Avoiding Data Re-segmentation for GROUP BY queries & Sorting by Low Cardinality to High Cardinality columns etc.
-- Sample SQL query:
Select Prod_Tier_Cd As Prod_Tier_Cd ,
Sum(Total_Store_Inventory_Qty) As Total_Store_Inventory_Qty ,
Sum(Total_Hub_Inventory_Qty) As Total_Hub_Inventory_Qty ,
Sum(In_Transit_Hub_Unit_Qty) As In_Transit_Hub_Unit_Qty ,
Sum(In_Transit_Store_Unit_Qty) As In_Transit_Store_Unit_Qty ,
Sum(Avg_Units_On_Hand_Qty) As Avg_Units_On_Hand_Qty ,
Sum(Prod_For_Sale_Prev_2_Week_Days) As Prod_For_Sale_Prev_2_Week_Days,
Sum(Day_1_To_7_Sales_Qty) As Day_1_To_7_Sales_Qty
FROM TABLEA
WHERE
Rptg_Dt='2013-05-13' AND
Region_Name='APAC' AND
Country_Name='APAC' AND
Prod_Categ_Cd IN ('XYZ','ABC')
GROUP BY Prod_Tier_Cd
--DBD Recommended Projection:
CREATE PROJECTION TABLEA /*+basename(TABLEA),createtype(L)*/
(
Rptg_Dt ENCODING RLE,
Country_Name ENCODING RLE,
Prod_Id ENCODING AUTO,
Prod_Categ_Cd ENCODING RLE,
Region_Name ENCODING RLE,
Metric_Categ_Name ENCODING AUTO,
Prod_Class_Name ENCODING AUTO,
Prod_Tier_Cd ENCODING AUTO,
Planner_Cd ENCODING AUTO,
Prod_Row_Id ENCODING AUTO,
Day_1_To_7_Sales_Qty ENCODING DELTARANGE_COMP
)
AS
SELECT TABLEA.Rptg_Dt,
TABLEA.Country_Name,
TABLEA.Prod_Id,
TABLEA.Prod_Categ_Cd,
TABLEA.Region_Name,
TABLEA.Metric_Categ_Name,
TABLEA.Prod_Class_Name,
TABLEA.Prod_Tier_Cd,
TABLEA.Planner_Cd,
TABLEA.Prod_Row_Id,
TABLEA.Day_1_To_7_Sales_Qty
FROM retail_ops.TABLEA
ORDER BY Rptg_Dt,
Country_Name,
Prod_Categ_Cd,
Prod_Class_Nr,
Prod_Row_Id,
Row_Order_Id,
Prod_Tier_Cd,
Prod_Id
SEGMENTED BY hash(TABLEA.Metric_Categ_Name) ALL NODES
KSAFE 1 ;
--Custom Projection with consideration of below tuning principles:
-- ALL GROUP COLUMNS IN ORDER BY CLAUSE (GROUP PIPELINED INSTEAD OF GROUP BY HASH )
-- ALL SEGMENTED COLUMNS ARE INCLUDED IN GROUP BY (TO AVOID RUN TIME RE-SEGMENTATION)
-- STATS Collected.
CREATE PROJECTION TABLEA_t /*+basename(TABLEA),createtype(L)*/
(
Rptg_Dt ENCODING RLE,
Country_Name ENCODING RLE,
Prod_Id ENCODING AUTO,
Prod_Categ_Cd ENCODING RLE,
Region_Name ENCODING RLE,
Metric_Categ_Name ENCODING AUTO,
Prod_Class_Name ENCODING AUTO,
Prod_Tier_Cd ENCODING AUTO,
Planner_Cd ENCODING AUTO,
Prod_Row_Id ENCODING AUTO,
Day_1_To_7_Sales_Qty ENCODING DELTARANGE_COMP
)
AS
SELECT TABLEA.Rptg_Dt,
TABLEA.Country_Name,
TABLEA.Prod_Id,
TABLEA.Prod_Categ_Cd,
TABLEA.Region_Name,
TABLEA.Metric_Categ_Name,
TABLEA.Prod_Class_Name,
TABLEA.Prod_Tier_Cd,
TABLEA.Planner_Cd,
TABLEA.Prod_Row_Id,
TABLEA.Day_1_To_7_Sales_Qty
FROM retail_ops.TABLEA
ORDER BY TABLEA.Prod_Tier_Cd,
TABLEA.Prod_Categ_Cd,
TABLEA.Region_Name,
TABLEA.Country_Name,
TABLEA.Rptg_Dt
SEGMENTED BY hash(TABLEA.Prod_Tier_Cd) ALL NODES
KSAFE 1 ;
-- Trace back in execution_engine_profiles
-- CLOCK TIME
--DBD Recommended
App_owner=> SELECT operator_name,path_id,SUM(counter_value) FROM V_monitor.execution_engine_profiles where transaction_id=63050394795945858 and statement_id=9 and counter_name='clock time (us)' GROUP BY 1,2 ORDER BY path_id ;
operator_name | path_id | SUM
---------------+---------+--------
NewEENode | -1 | 93
ExprEval | -1 | 100
Root | -1 | 144
ExprEval | 1 | 151
Sort | 1 | 476
ExprEval | 3 | 1591
GroupByHash | 3 | 17451
ParallelUnion | 3 | 566618
GroupByPipe | 3 | 1219
ExprEval | 4 | 313
Scan | 4 | 12623
(11 rows)
-- CUSTOM
Time: First fetch (11 rows): 1027.724 ms. All rows formatted: 1027.777 ms
App_owner=> SELECT operator_name,path_id,SUM(counter_value) FROM V_monitor.execution_engine_profiles where transaction_id=63050394795945858 and statement_id=21 and counter_name='clock time (us)' GROUP BY 1,2 ORDER BY path_id ;
operator_name | path_id | SUM
---------------+---------+--------
NewEENode | -1 | 67
ExprEval | -1 | 90
Root | -1 | 107
ExprEval | 1 | 161
Sort | 1 | 444
ExprEval | 3 | 111
GroupByPipe | 3 | 1127
ExprEval | 4 | 1200
Scan | 4 | 104084
(9 rows)
--EXECUTION TIME (CPU)
--DBD Recommended
App_owner=> SELECT operator_name,path_id,SUM(counter_value) FROM V_monitor.execution_engine_profiles where transaction_id=63050394795945858 and statement_id=9 and counter_name='execution time (us)' GROUP BY 1,2 ORDER BY path_id ;
operator_name | path_id | SUM
---------------+---------+-------
NewEENode | -1 | 71
ExprEval | -1 | 86
Root | -1 | 140
ExprEval | 1 | 147
NetworkSend | 1 | 58
Sort | 1 | 442
NetworkRecv | 1 | 244
ExprEval | 3 | 577
GroupByHash | 3 | 13809
NetworkRecv | 3 | 1297
ParallelUnion | 3 | 5797
GroupByPipe | 3 | 1154
StorageUnion | 3 | 3337
NetworkSend | 3 | 187
ExprEval | 4 | 264
Scan | 4 | 12505
(16 rows)
-- CUSTOM
Time: First fetch (16 rows): 972.249 ms. All rows formatted: 972.314 ms
App_owner=> SELECT operator_name,path_id,SUM(counter_value) FROM V_monitor.execution_engine_profiles where transaction_id=63050394795945858 and statement_id=21 and counter_name='execution time (us)' GROUP BY 1,2 ORDER BY path_id ;
operator_name | path_id | SUM
---------------+---------+-------
NewEENode | -1 | 58
ExprEval | -1 | 77
Root | -1 | 104
ExprEval | 1 | 159
NetworkSend | 1 | 47
Sort | 1 | 399
NetworkRecv | 1 | 256
ExprEval | 3 | 93
StorageMerge | 3 | 547
GroupByPipe | 3 | 536
ExprEval | 4 | 470
Scan | 4 | 72454
(12 rows)
--OBSERVATIONS:
CUSTOM has better stats than DBD recommended . However, Optimizer is showing DBD recommended/Super projection in Explain plan.
Thanks for the detailed reply. Hm... When you ran the DBD, did you provide it with this as one of its example queries? That doesn't look to me like a design that we would produce for that query.
Thanks,
Adam
Adding up the counter values, the DBD recommended profile appears to be 35,000 counters less than your custom projection.
What we really need to see is the explain plan for the query, using the DBD projection, and your custom projection.