How to get counts in milliseconds ?

edited January 2018 in General Discussion

Hello,
I have 50 crores of data with 500 data points (columns), I have to server counts to 1000 subscribers at a time, what I have to do to get counts in milliseconds?
Thanks in advance

Best Answer

  • Answer ✓

    I have 1 crore data in my vertica db::

    user_testing=> select count(*) from likes_test;
      count   
    ----------
     10000001
    (1 row)
    
    Time: First fetch (1 row): 282.163 ms. All rows formatted: 282.187 ms
    

    The data contains like this

    [
    {"id":1,
    "likes":[401,218,134,941,565,457,864,346,774,921,766,432,815,63,822,202,679,409,457,743,902,199,717,232,172,958,179,808,271,418,145,949,806,399,68,935,423,287,31,365,68,875,347,786,386,792,620,903,136,67,539,196,866,419,390,612,591,881,68,575,226,291,354,380,889,105,412,511,411,861,61,802,388,168,655,296,43,541,824,756,649,2,204,396,231,499,42,773,918,19,229,716,675,762,939,352,35,949,445,787,215,524,998,596,267,444,690,52,762]}
    ]

    When i query on likes it takes 29 seconds

    user_testing=> select count(id) from likes_test where mapcontainsvalue(likes,'2')='t' or mapcontainsvalue(likes,'3');
      count  
    ---------
     1967925
    (1 row)
    
    Time: First fetch (1 row): 29112.269 ms. All rows formatted: 29112.297 ms
    

    Then I decreased interval of AnalyzeRowCount service.

       user_testing=> ALTER DATABASE db_testing_0 SET AnalyzeRowCountInterval=0;
    ALTER DATABASE
    Time: First fetch (0 rows): 209.445 ms. All rows formatted: 209.456 ms
    

    Again I tried same query it takes 26 seconds

        user_testing=> select count(id) from likes_test where mapcontainsvalue(likes,'2')='t' or mapcontainsvalue(likes,'3');
          count  
        ---------
         1967925
        (1 row)
    
        Time: First fetch (1 row): 26893.240 ms. All rows formatted: 26893.264 ms
    

    Then I created projection on it but result is same :/:/

    user_testing=> select count(id) from likes_test_p where mapcontainsvalue(likes,'2')='t' or mapcontainsvalue(likes,'4');
      count  
    ---------
     1968654
    (1 row)
    
    Time: First fetch (1 row): 27286.438 ms. All rows formatted: 27286.462 ms
    user_testing=> 
    

    What I have to do to get counts in less time, provide solution :(

Answers

  • @Jim_Knicely I have to serve** counts to 1000 subscribers at a time

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Hi,

    What do you mean that you have to provide counts? Do you mean a table row count? Getting a row count is normally pretty quick in Vertica. But maybe you can speed things up a bit with a Live Aggregate Projection?

    dbadmin=> select count(*) from public.big_table2;
      count
    ----------
     50000000
    (1 row)
    
    Time: First fetch (1 row): 689.151 ms. All rows formatted: 689.192 ms
    
    dbadmin=> create projection public.big_table2_count_pr as select 1 gb, count(pk) rc from public.big_table2 group by 1;
    WARNING 5993:  Projection is irregularly segmented by column
    HINT:  Consider using a segmentation expression, such as SEGMENTED BY HASH(column)
    WARNING 6852:  Live Aggregate Projection "big_table2_count_pr" will be created for "big_table2". Data in "big_table2" will be neither updated nor deleted
    WARNING 4468:  Projection <public.big_table2_count_pr> is not available for query processing. Execute the select start_refresh() function to copy data into this projection.
              The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
    CREATE PROJECTION
    Time: First fetch (0 rows): 11.914 ms. All rows formatted: 11.932 ms
    
    dbadmin=> select refresh('public.big_table2');
                                                                                                                                                       refresh
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Refresh completed with the following outcomes:
    Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
    ----------------------------------------------------------------------------------------
    "public"."big_table2_count_pr": [big_table2] [refreshed] [scratch] [0] [1]
    
    (1 row)
    
    dbadmin=> select rc from public.big_table2_count_pr;
        rc
    ----------
     50000000
    (1 row)
    
    Time: First fetch (1 row): 7.612 ms. All rows formatted: 7.662 ms
    
  • But our table getting updated continuously, in this case, How can I get counts?

  • Hi @ashu123!

    Try to decrease interval of AnalyzeRowCount service.

    dbadmin=> select list_services('TM');
    
                                     list_services
    
    --------------------------------------------------------------------------------
    
     Service: 'AnalyzeRowCount' is enabled , interval 60 second(s)
     ...
    

    Docs

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file