Projection Tuning Analytical function over subquery
Hi Vertica Folks,
I have query with group by on high cardinal column in subquery and analytical function over top of it. The subquery gives subsecond response but whereas the analytical function take lot of time.
My projection has group by on high cardinal values and top in the order by and established group by pipeline.
The analytical function used in Row_number over order by on measure values desc- to pick top 100.
How to tune the query further for the olap function?
0
Comments
Hi ,
First , when you ask such question is very helpful when the query is being attached.
I see two main issues with the used of analytics functions :
1) in most of the time Analytic function do not run in parallel ( you can see it by profile your query )
2) The fact you "order by desc" required to process the entire data set (so if you have large result set ,it can take time ...) , group by pipeline is not helpful for this case .
Thanks
Hi,
This is the query which I am trying to tune.
SELECT DIMENSION,AGG
SELECT DIMENSION,AGG, ROW_NUM() OVER ( ORDER BY AGG DESC) RNK
(SELECT DIMENSION, SUM(MEASURE) AS AGG FROM TABLE WHERE FILTER1 AND FILTTER2 AND FILTER3 GROUP BY DIMENSION) AS A
WHERE A.AGG<>0) AS B WHERE B.RNK<=50;
I agree, could you please suggest me a way to tune this query through projection.
Hi David
First avoid the used of ROW_NUMBER functions , and modify you query to looks like below query :
SELECT DIMENSION, SUM(MEASURE) AS AGG
FROM TABLE
WHERE FILTER1
AND FILTTER2
AND FILTER3
GROUP BY DIMENSION HAVING SUM(MEASURE) <> 0 order by 2 desc limit 50
Make sure your projection order columns include the filters + the group by column ( if you vae hard time with that you can used DBD )
Hope you will find it useful
Thanks
Hi,
Thanks for the your input.
There is a challenge here, we cannot modify this query as do as a conventional step in tuning. This OLAP function is generated by service layer which runs on top of vertica to publish to dashboard. So, modifying existing query cannot be done unfortunately.
Also, we dont have DBD here to take a lead on performance tuning. We use our best knowledge to tune the query.
is there any other way you could suggest for me here.
Thanks.
DBD is part of vertica SW distribution , do you any kind of limitation to execute it ?
Thanks
Yes, We have limitation. We are not allowed to use it.
Hi
Vertica expose API's to the DBD , this should simplify things for you , you can Call the API's from your code . Using this method you will need to access directly the vertica server .
I hope this could help
Thanks