We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


How to get counts in milliseconds ? — Vertica Forum

How to get counts in milliseconds ?

edited January 2018 in General Discussion

Hello,
I have 50 crores of data with 500 data points (columns), I have to server counts to 1000 subscribers at a time, what I have to do to get counts in milliseconds?
Thanks in advance

Best Answer

  • Answer ✓

    I have 1 crore data in my vertica db::

    user_testing=> select count(*) from likes_test;
      count   
    ----------
     10000001
    (1 row)
    
    Time: First fetch (1 row): 282.163 ms. All rows formatted: 282.187 ms
    

    The data contains like this

    [
    {"id":1,
    "likes":[401,218,134,941,565,457,864,346,774,921,766,432,815,63,822,202,679,409,457,743,902,199,717,232,172,958,179,808,271,418,145,949,806,399,68,935,423,287,31,365,68,875,347,786,386,792,620,903,136,67,539,196,866,419,390,612,591,881,68,575,226,291,354,380,889,105,412,511,411,861,61,802,388,168,655,296,43,541,824,756,649,2,204,396,231,499,42,773,918,19,229,716,675,762,939,352,35,949,445,787,215,524,998,596,267,444,690,52,762]}
    ]

    When i query on likes it takes 29 seconds

    user_testing=> select count(id) from likes_test where mapcontainsvalue(likes,'2')='t' or mapcontainsvalue(likes,'3');
      count  
    ---------
     1967925
    (1 row)
    
    Time: First fetch (1 row): 29112.269 ms. All rows formatted: 29112.297 ms
    

    Then I decreased interval of AnalyzeRowCount service.

       user_testing=> ALTER DATABASE db_testing_0 SET AnalyzeRowCountInterval=0;
    ALTER DATABASE
    Time: First fetch (0 rows): 209.445 ms. All rows formatted: 209.456 ms
    

    Again I tried same query it takes 26 seconds

        user_testing=> select count(id) from likes_test where mapcontainsvalue(likes,'2')='t' or mapcontainsvalue(likes,'3');
          count  
        ---------
         1967925
        (1 row)
    
        Time: First fetch (1 row): 26893.240 ms. All rows formatted: 26893.264 ms
    

    Then I created projection on it but result is same :/:/

    user_testing=> select count(id) from likes_test_p where mapcontainsvalue(likes,'2')='t' or mapcontainsvalue(likes,'4');
      count  
    ---------
     1968654
    (1 row)
    
    Time: First fetch (1 row): 27286.438 ms. All rows formatted: 27286.462 ms
    user_testing=> 
    

    What I have to do to get counts in less time, provide solution :(

Answers

  • @Jim_Knicely I have to serve** counts to 1000 subscribers at a time

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Hi,

    What do you mean that you have to provide counts? Do you mean a table row count? Getting a row count is normally pretty quick in Vertica. But maybe you can speed things up a bit with a Live Aggregate Projection?

    dbadmin=> select count(*) from public.big_table2;
      count
    ----------
     50000000
    (1 row)
    
    Time: First fetch (1 row): 689.151 ms. All rows formatted: 689.192 ms
    
    dbadmin=> create projection public.big_table2_count_pr as select 1 gb, count(pk) rc from public.big_table2 group by 1;
    WARNING 5993:  Projection is irregularly segmented by column
    HINT:  Consider using a segmentation expression, such as SEGMENTED BY HASH(column)
    WARNING 6852:  Live Aggregate Projection "big_table2_count_pr" will be created for "big_table2". Data in "big_table2" will be neither updated nor deleted
    WARNING 4468:  Projection <public.big_table2_count_pr> is not available for query processing. Execute the select start_refresh() function to copy data into this projection.
              The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
    CREATE PROJECTION
    Time: First fetch (0 rows): 11.914 ms. All rows formatted: 11.932 ms
    
    dbadmin=> select refresh('public.big_table2');
                                                                                                                                                       refresh
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Refresh completed with the following outcomes:
    Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
    ----------------------------------------------------------------------------------------
    "public"."big_table2_count_pr": [big_table2] [refreshed] [scratch] [0] [1]
    
    (1 row)
    
    dbadmin=> select rc from public.big_table2_count_pr;
        rc
    ----------
     50000000
    (1 row)
    
    Time: First fetch (1 row): 7.612 ms. All rows formatted: 7.662 ms
    
  • But our table getting updated continuously, in this case, How can I get counts?

  • Hi @ashu123!

    Try to decrease interval of AnalyzeRowCount service.

    dbadmin=> select list_services('TM');
    
                                     list_services
    
    --------------------------------------------------------------------------------
    
     Service: 'AnalyzeRowCount' is enabled , interval 60 second(s)
     ...
    

    Docs

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file