Options

Order by clause in Views

I created a view against a table using the following format "select col1, count(col2), count(distinct col3) from table group by col1, order by col1 desc". But when I do a select * from view_name, it seems that the results are not sorted according to my view definition. They should be sorted in descending order by the first column. Does Vertica not respect the order by clause in views? If no then why are my results not coming out correctly? and If yes, why does it have an example of a query in the SQL Documentation that has an order by clause? Edit: I am running on version 6.1.3.

Comments

  • Options
    Can you post your explain plan here - when you query you create view ! 
  • Options
    Can you post your explain plan here - when you query you create view ! 
  • Options
    See example:
    I have a table called Rio:    dbadmin=> select * from rio;        city
    -----------------
     cabo frio
     copacabana
     rio de janreiro
    (3 rows)
    Create the view with the order clause and query the view:
    dbadmin=> create view rio_desc as select count(city),city from rio group by 2 order by 2 desc;  CREATE VIEW
    dbadmin=> select * from rio_desc ;
     count |      city
    -------+-----------------
         1 | rio de janreiro
         1 | copacabana
         1 | cabo frio
    (3 rows)
    -- see it returns me following the order by rule

    Take a look at the plan to see the order by step
    explain select * from rio_desc ;
     Access Path:
     +-SORT [Cost: 7, Rows: 3 (NO STATISTICS)] (PATH ID: 2)
     |  Order: rio.city DESC
     | +---> GROUPBY PIPELINED [Cost: 6, Rows: 3 (NO STATISTICS)] (PATH ID: 3)
     | |      Aggregates: count(rio.city)
     | |      Group By: rio.city
     | | +---> STORAGE ACCESS for rio [Cost: 5, Rows: 3 (NO STATISTICS)] (PATH ID: 4)
     | | |      Projection: public.rio_super
     | | |      Materialize: rio.city

  • Options
    Hi Adrian, 

    Thanks for the prompt reply. I do see the order by clause in my explain plan. I should have added this yesterday in my question yesterday. My results set is kinda odd. It gives the first few records in the right descending order but then it starts over again. It seems like its doing the order by in stages. ex. posted below:


    Please ignore the outer select * from. I was trying something different. 

    create view prodrush.v_session_count_Test as select * from (select dim_Date_gid, count(*), count(distinct dim_user_Gid) session_Actual from prodrush.core_Session_STart group by dim_Date_Gid ) a order by a.dim_Date_gid desc;


    explain select * from prodrush.v_session_count_Test;

     Access Path:
     +-SORT [Cost: 75K, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
     |  Order: a.dim_Date_gid DESC
     |  Execute on: All Nodes
     |  Execute on: All Nodes
     | +---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 75K, Rows: 10K (NO STATISTICS)] (PATH ID: 4)
     | |      Aggregates: count(DISTINCT CORE_SESSION_START.DIM_USER_GID), sum_of_count(*)
     | |      Group By: CORE_SESSION_START.DIM_DATE_GID
     | |      Execute on: All Nodes
     | | +---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 75K, Rows: 10K (NO STATISTICS)] (PATH ID: 5)
     | | |      Aggregates: count(*)
     | | |      Group By: CORE_SESSION_START.DIM_DATE_GID, CORE_SESSION_START.DIM_USER_GID
     | | |      Execute on: All Nodes
     | | | +---> STORAGE ACCESS for CORE_SESSION_START [Cost: 10K, Rows: 47M (NO STATISTICS)] (PATH ID: 6)
     | | | |      Projection: prodrush.CORE_SESSION_START_b0
     | | | |      Materialize: CORE_SESSION_START.DIM_USER_GID, CORE_SESSION_START.DIM_DATE_GID
     | | | |      Execute on: All Nodes



    dim_Date_gid | count  | session_Actual
    --------------+--------+----------------
         20140506 | 430248 |         136415
         20140428 | 442677 |         148317
         20140425 | 443542 |         150815
         20140421 | 474549 |         158368
         20140411 | 447423 |         142000
         20140408 | 477893 |         148368
         20140407 | 484159 |         149017
         20140326 | 553793 |         179314
         20140313 | 355971 |         102434
         20140306 | 363223 |         101124
         20140302 | 346030 |         100294
         20140227 | 340832 |          98949
         20140226 | 345988 |          98747
         20140223 | 358341 |         100646
         20140212 | 357566 |         101414
         20140202 | 512550 |         206070
         20140201 | 512621 |         238803
         20140131 | 318714 |         160801
         20140122 | 118070 |          36597
         20140114 | 112691 |          35164
         20140108 | 109384 |          36227
         20140104 | 107475 |          35681
         20140103 | 103139 |          34757
         20131231 |  81115 |          26936
         20131226 |  78721 |          26897
         20131129 |  35680 |          12101
         20131123 |  35266 |          11737
         20131119 |  35381 |          11012
         20131115 |  33234 |          10068
         20131110 |  26475 |           7741
         20131107 |  21335 |           5960
         20131102 |  20599 |           5655
         20131015 |      9 |              2
                  |      2 |              0
         20140508 | 424302 |         135379
         20140426 | 456872 |         157898
         20140409 | 467122 |         146348
         20140403 | 498568 |         155057
         20140330 | 528140 |         165537
         20140323 | 390879 |         129574
         20140317 | 336300 |          99432
         20140315 | 335889 |          98972
         20140314 | 336840 |          98915
         20140303 | 343591 |          99907
         20140224 | 351225 |          99135
         20140214 | 342582 |         100042
         20140210 | 359154 |         102985
         20140125 | 131294 |          40865
         20140102 |  99228 |          33244
         20131204 |  37285 |          12617
         20131201 |  36773 |          12397
         20131127 |  37755 |          12612
         20131125 |  35714 |          11826
         20131124 |  36710 |          11830
         20131118 |  34857 |          10492
         20131031 |    171 |             37




  • Options
    Post the explain plan! of you query plz
  • Options
    I did. But it might be around the time I edited my post. 
  • Options
    It looks ok to me ! 
    One thing - NO STATISTICS - run refresh on the used projections.
  • Options
    I ran analyze_stats on the entire table. I am a bit confused. How are the results OK? The first record has 6th May. If you scrill down. You will see 8th may. the entry for 8th May should be the first record right?
  • Options
    -sorry i was talking about the plan output and not the result set ! 
    True that that the result set output is kind strange - i have no exp for that ! 
  • Options
    Anyone at Vertica know about this? Adam?
  • Options
    I guess I'm famous now?

    Hm...  No, I don't know offhand.  I could hazard a guess.  Really, though, I think Vertica assumes that the output of any SELECT statement need not be ordered unless it has an explicit ORDER BY clause on the outermost SQL statement itself.  (Not in a subquery, view, etc -- maybe the data was ordered coming out of the subquery, but there are cases where it's advantageous to give up on keeping track of the ordering as part of the next step of the query.)

    In general -- you can put an ORDER BY on a table but you can't assume that SELECT will fetch the data in exactly that order (though imposing the order on the data will be very cheap in this case); I would expect the same for views.

    Do you have a pointer to the bit of the documentation that suggests putting an ORDER BY in the view?

    Thanks,
    Adam
  • Options
    Hi Karan,
    Very interesting find. I was able to replicate your behavior. When I compare the explain plan of the view and the underlying SELECT statement used in the view, the  (SORT OUTPUT) step is missing in the view's plan. Looks like Vertica is not sorting your data at all!

    Querying the VIEWS system table for the view definition still shows the order by clause. Even the example in the Vertica documentation has an Order By clause for Create View.

    Adam,
    the relevant link, including the Order By example is at https://my.vertica.com/docs/6.1.x/HTML/index.htm#10810.htm

    /Sajan
  • Options
    Can you post your used code i am really interested in seeing this on my box.
  • Options
    I used a simple select distinct statement, as seen in the snapshot below. The distinct is actually optional, I used it only to track the order of the data easily.
    As you can see, there a SORT OUTPUT step in the plan. When creating view with the same select statement, the explain plan doesn't contain the SORT OUTPUT step or anything equivalent.

    I am running this in a 3 node cluster and the projection used is segmented. You may not be able to replicate the issue in a single node instance or a replicated projection, although I havent tried that yet.

    image
  • Options
    OK, I might be that because i have tested in a single node installation i didn't experience the same behavior. Thx for the example
  • Options
    Could it be that the sort is only being done on a per ROS container basis?
  • Options
    That appears to be the case. Is this a potential bug?

  • Options
    Anyone?
  • Options
    Hi Karan,

    Folks here are still taking a look.  But the existing answer is, as I understand it, the SQL standard doesn't require that views be sorted, even if the SQL inside the view definition contains an ORDER BY clause.  And I don't think we intend to promise anything stronger here than is guaranteed by the standard.  (Views, regardless of their definition, are like tables:  They are a collection of rows; even in Vertica, where we keep our data sorted on disk, the order when SELECT'ing from either a view or a table is unspecified.)

    This would seem to imply that the documentation page that you cited is, at best, misleading.  It doesn't actually state that the sorted output will always be sorted because an ORDER BY clause is used there.  But it would be entirely silly to use an ORDER BY there otherwise...

    I would expect (though can't promise here) that this will be addressed, by either a new-feature announcement or a documentation change, in a future release.

    Adam
  • Options
    Well I am OK either way as long as the system works according to the documentation. 

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file