Be Careful with the Sequence CACHE Value

[Deleted User] Administrator
edited June 2018 in Tips from the Team

Jim Knicely authored this tip.

The default session cache for a sequence is 250,000. Although you can change the cache value of a sequence, setting the value too low can adversely affect performance!

Example:

dbadmin=> SELECT COUNT(*) FROM ten_thousand_records;
COUNT
-------
10000
(1 row)

dbadmin=> CREATE SEQUENCE default_cache;
CREATE SEQUENCE

dbadmin=> CREATE SEQUENCE non_default_cache CACHE 5;
CREATE SEQUENCE

dbadmin=> \timing on
Timing is on.

dbadmin=> CREATE TABLE t1 AS SELECT default_cache.nextval FROM ten_thousand_records;
CREATE TABLE
Time: First fetch (0 rows): 32.996 ms. All rows formatted: 33.013 ms

dbadmin=> CREATE TABLE t2 AS SELECT non_default_cache.nextval FROM ten_thousand_records;
CREATE TABLE
Time: First fetch (0 rows): 45314.645 ms. All rows formatted: 45314.663 ms

Note that it took about 45 seconds to generate 10,000 new sequence values on a 3-node Vertica cluster when the sequence cache was 5, but well under 1 second when the cache was left at the default.
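
If you have already created a sequence with a cache that is too small, you do not have to drop and recreate it; the cache setting can be changed in place. A minimal sketch using the sequence from the example above (the sequences system table shows the current setting in session_cache_count; output omitted):

dbadmin=> SELECT sequence_name, session_cache_count FROM sequences WHERE sequence_name = 'non_default_cache';

dbadmin=> ALTER SEQUENCE non_default_cache CACHE 250000;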

Have fun!

Comments

  • sreeblr - Employee

    I saw an earlier post by Jim suggesting a cache size of 500. If we typically have 100,000 rows, with data sometimes going up to a million, what would the ideal cache size be?

  • I wrote a query long ago to do that. Use your judgment here - this query only looks at the size of the projection, and assumes that loads are of similar size. If that's not true, you might want to adjust.

    Generally speaking, if you're doing batch loads of millions of rows, the default cache size will be too low. It's quite reasonable to bump the cache size up to 1M or even 2M, depending on what you're doing. Likewise, if you're loading 100 rows into a table during a load, a default cache size of 250k is kind of ridiculous. In those cases, a much smaller size makes more sense.

    Also note that Vertica fixed (in 8.1, I think) some of the catalog locking issues we saw in 7.x, so cache sizes aren't as big an issue as they once were. Still, you can see some performance gains from appropriately sized caches.

    SELECT sequence,
           cache,
           increment_by AS inc,
           TO_CHAR(projection_size, '999,999,999,999') AS projection_size,
           TO_CHAR((cache / increment_by) / projection_size, '999,999.99999') AS ratio,
           CASE
             WHEN (cache / increment_by) / projection_size > 5000.0 THEN cache / 5000
             WHEN (cache / increment_by) / projection_size BETWEEN 1000.0 AND 4999.0 THEN cache / 2500
             WHEN (cache / increment_by) / projection_size BETWEEN 10.0 AND 999.0 THEN cache / 500
             WHEN (cache / increment_by) / projection_size BETWEEN 0.01 AND 0.1 THEN cache * 10
             WHEN (cache / increment_by) / projection_size BETWEEN 0.001 AND 0.09 THEN cache * 25
             WHEN (cache / increment_by) / projection_size BETWEEN 0.0001 AND 0.009 THEN cache * 50
             WHEN (cache / increment_by) / projection_size BETWEEN 0.00001 AND 0.0009 THEN cache * 100
             WHEN (cache / increment_by) / projection_size BETWEEN 0.000001 AND 0.00009 THEN cache * 500
             WHEN (cache / increment_by) / projection_size < 0.000001 THEN 5000000
           END AS recommendation
    FROM (
        SELECT sequence_schema || '.' || sequence_name AS sequence,
               session_cache_count AS cache,
               increment_by,
               CASE WHEN p.is_segmented THEN SUM(ros_row_count)
                    ELSE MAX(ros_row_count) END AS projection_size
        FROM sequences s
        JOIN projection_storage ps ON s.identity_table_id = ps.anchor_table_id
        JOIN projections p ON s.identity_table_id = p.anchor_table_id
        WHERE ps.ros_row_count > 20
        GROUP BY 1, 2, 3, p.is_segmented
    ) foo
    WHERE (cache / increment_by) / projection_size < 0.1
       OR (cache / increment_by) / projection_size > 10.0
    ORDER BY 2, 1;
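
    As a quick illustration of how the ratio drives the recommendation (hypothetical numbers, assuming increment 1 and the default cache of 250,000): a 25,000,000-row projection gives a ratio of 250,000 / 25,000,000 = 0.01, which the CASE maps to cache * 10 = 2,500,000, while a 100-row projection gives a ratio of 250,000 / 100 = 2,500, which maps to cache / 2500 = 100. Both line up with the sizing guidance above.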

  • sreeblr - Employee

    I ran the query, but it didn't return any rows.

  • sreeblr - Employee

    Also, we don't have identity columns, only normal sequences. Can you help modify the query to run without identity columns?
