Projection Tuning Analytical function over subquery

Hi Vertica Folks,

 

  I have query with group by on high cardinal column in subquery and analytical function over top of it. The subquery gives subsecond response but whereas the analytical function take lot of time.

 

My projection has group by on high cardinal values and top in the order by and established group by pipeline.

 

The analytical function used in Row_number over order by on measure values desc-  to pick top 100.

 

How to tune the query further for the olap function? 

 

Comments

  • Hi ,

    First , when you ask such question is very helpful when the query is being attached.

     

    I see two main issues  with the used of analytics functions :

     

    1) in most of the time Analytic function do not run in  parallel  ( you can see it by profile your query )

    2) The fact you "order by desc" required to process the entire data set (so if you have large result set ,it can take time ...)  ,  group by pipeline is not helpful for this case .

     

    Thanks 

  • Hi,

     

      This is the query which I am trying to tune.

    SELECT DIMENSION,AGG
    SELECT DIMENSION,AGG, ROW_NUM() OVER ( ORDER BY AGG DESC) RNK
    (SELECT DIMENSION, SUM(MEASURE) AS AGG FROM TABLE WHERE FILTER1 AND FILTTER2 AND FILTER3 GROUP BY DIMENSION) AS A
    WHERE A.AGG<>0) AS B WHERE B.RNK<=50;

     

    I agree, could you please suggest me a way to tune this query through projection.

  • Hi  David 

     

    First avoid the used of ROW_NUMBER functions , and modify you query to looks like below query :

     

    SELECT DIMENSION, SUM(MEASURE) AS AGG
    FROM TABLE
    WHERE FILTER1
    AND FILTTER2
    AND FILTER3
    GROUP BY DIMENSION HAVING SUM(MEASURE) <> 0 order by 2 desc limit 50

    Make sure your projection order columns include the filters + the group by column ( if you vae hard time with that you can used DBD )

     

    Hope you will find it useful

     

    Thanks 

     

     

  • Hi,

     

       Thanks for the your input.

     

       There is a challenge here, we cannot modify this query as do as a conventional step in tuning. This OLAP function is generated by service layer which runs on top of vertica to publish to dashboard. So, modifying existing query cannot be done unfortunately.

     

       Also, we dont have DBD here to take a lead on performance tuning. We use our best knowledge to tune the query.

     

      is there any other way you could suggest for me here. 

     

    Thanks.

  • Hi
    DBD is part of vertica SW distribution , do you any kind of limitation to execute it ?

    Thanks
  • Yes, We have limitation. We are not allowed to use it.

  • Hi

    Vertica expose API's to the DBD , this should simplify  things for you , you can Call the API's from your code . Using this method you will need to access directly the vertica server .

     

    I hope this could help

     

    Thanks 

     

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file