Keep Data Collector Information for a Set Interval

The Data Collector retains history of important system activities and records essential performance and resource utilization counters. You probably know that you can set a size restraint (memory and disk space in kilobytes) for the specified Data Collector table on all nodes.

But if you are looking to keep records for some period of time, i.e. the last 6 days, it would be very difficult to calculate how much memory or disk space is needed.

Good news is you don’t have to worry about that! Simply set a “time capacity” for individual Data Collector tables on all nodes using the SET_DATA_COLLECTOR_TIME_POLICY function!

Example:

dbadmin->mydb@sandbox1=> SELECT node_name, component, description, interval_set, interval_time
dbadmin->mydb@sandbox1-->   FROM data_collector
dbadmin->mydb@sandbox1-->  WHERE table_name = 'dc_requests_issued'
dbadmin->mydb@sandbox1-->  ORDER BY 1;
    node_name    |   component    |            description             | interval_set | interval_time
-----------------+----------------+------------------------------------+--------------+---------------
v_mydb_node0001 | RequestsIssued | History of all SQL requests issued | f            | 0
v_mydb_node0002 | RequestsIssued | History of all SQL requests issued | f            | 0
v_mydb_node0003 | RequestsIssued | History of all SQL requests issued | f            | 0
(3 rows)

dbadmin->mydb@sandbox1=>* SELECT set_data_collector_time_policy('RequestsIssued', '6 days'::interval);
set_data_collector_time_policy
--------------------------------
SET
(1 row)

dbadmin->mydb@sandbox1=> SELECT node_name, component, description, interval_set, interval_time
dbadmin->mydb@sandbox1-->   FROM data_collector
dbadmin->mydb@sandbox1-->  WHERE table_name = 'dc_requests_issued'
dbadmin->mydb@sandbox1-->  ORDER BY 1;
    node_name    |   component    |            description             | interval_set | interval_time
-----------------+----------------+------------------------------------+--------------+---------------
v_mydb_node0001 | RequestsIssued | History of all SQL requests issued | t            | 6
v_mydb_node0002 | RequestsIssued | History of all SQL requests issued | t            | 6
v_mydb_node0003 | RequestsIssued | History of all SQL requests issued | t            | 6
(3 rows)

Helpful Link:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/VerticaFunctions/DataCollection/SET_DATA_COLLECTOR_TIME_POLICY.htm

Have fun!

Comments

  • Hi Jim,

    Thanks for the tip,
    The documentation states that "If you specify interval, Vertica enforces the setting that is exceeded first (size or time). Before you include a time restraint, verify that the disk size capacity is sufficiently large." So to my understanding we still need to estimate how much disk is needed otherwise the data will purged...

    Chaima

  • Jim_KnicelyJim_Knicely Administrator
    edited January 2019

    Correct!

    What's interesting is that you could use time based retention with 0 disk space usage! You just need to set the in memory buffer.

    Example:

    dbadmin=> SELECT set_data_collector_policy('RequestsIssued', '-1', '-1');
     set_data_collector_policy
    ---------------------------
     SET
    (1 row)
    
    dbadmin=> SELECT get_data_collector_policy('RequestsIssued');
     get_data_collector_policy
    ---------------------------
     Not kept.
    (1 row)
    
    dbadmin=> SELECT set_data_collector_time_policy('RequestsIssued', '3 DAYS'::interval);
    WARNING 6430:  Time-based retention will not occur unless a memory buffer is also requested
    HINT:  Specify a positive value for memory retention
     set_data_collector_time_policy
    --------------------------------
     SET
    (1 row)
    
    dbadmin=> SELECT set_data_collector_policy('RequestsIssued', '1500', '-1');
     set_data_collector_policy
    ---------------------------
     SET
    (1 row)
    
    dbadmin=> SELECT set_data_collector_time_policy('RequestsIssued', '3 DAYS'::interval);
     set_data_collector_time_policy
    --------------------------------
     SET
    (1 row)
    
    dbadmin=> SELECT component, table_name, current_memory_bytes, current_disk_bytes, interval_set, interval_time
    dbadmin->   FROM data_collector
    dbadmin->  WHERE component = 'RequestsIssued';
       component    |     table_name     | current_memory_bytes | current_disk_bytes | interval_set | interval_time
    ----------------+--------------------+----------------------+--------------------+--------------+---------------
     RequestsIssued | dc_requests_issued |                 8606 |                  0 | t            | 3
    (1 row)
    
    dbadmin=> SELECT COUNT(*) FROM dc_requests_issued;
     COUNT
    -------
        21
    (1 row)
    
  • chaimachaima Employee

    Fantastic Jim!
    Thanks a lot for the update :smile:

Sign In or Register to comment.