System table - DISK_STORAGE

Tim_1Tim_1 Registered User

Hi,
Does anyone know how the "throughput" and "latency" columns get populated in the DISK_STORAGE table?
select * from disk_storage
where storage_usage ='DATA,TEMP'
and node_name = 'v_marketdata_node0002';

v_marketdata_node0002 /apps/vertica/Data/MarketData/v_marketdata_node0002_data DATA,TEMP 0 0 0 Active 4096 399639126 1561090 393259564 1536170 49%

Still 0 after running:-
SELECT MEASURE_LOCATION_PERFORMANCE('/apps/vertica/Data/MarketData/v_marketdata_node0002_data' , 'v_marketdata_node0002');

Regards
Tim

Comments

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited June 14

    Hi,

    The DISK_STORAGE system table returns statistics on the file system, not Vertica's usage of it. Vertica won't populate the THROUGHPUT and LATENCY columns automatically. You have to do it manually with the SET_LOCATION_PERFORMANCE function!

    See:
    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/AdministratorsGuide/StorageLocations/SettingLocationPerformance.htm

    Example:

    dbadmin=> SELECT node_name, storage_path, throughput, latency FROM disk_storage WHERE storage_usage = 'DATA,TEMP';
         node_name      |                 storage_path                  | throughput | latency
    --------------------+-----------------------------------------------+------------+---------
     v_test_db_node0001 | /home/dbadmin/test_db/v_test_db_node0001_data |          0 |       0
     v_test_db_node0002 | /home/dbadmin/test_db/v_test_db_node0002_data |          0 |       0
     v_test_db_node0003 | /home/dbadmin/test_db/v_test_db_node0003_data |          0 |       0
    (3 rows)
    
    dbadmin=> SELECT MEASURE_LOCATION_PERFORMANCE('/home/dbadmin/test_db/v_test_db_node0002_data', 'v_test_db_node0002');
    WARNING 3914:  measure_location_performance can take a long time to execute. Please check logs for progress
              MEASURE_LOCATION_PERFORMANCE
    ------------------------------------------------
     Throughput : 23 MB/sec. Latency : 46 seeks/sec
    (1 row)
    
    dbadmin=> SELECT node_name, storage_path, throughput, latency FROM disk_storage WHERE storage_usage = 'DATA,TEMP';
         node_name      |                 storage_path                  | throughput | latency
    --------------------+-----------------------------------------------+------------+---------
     v_test_db_node0001 | /home/dbadmin/test_db/v_test_db_node0001_data |          0 |       0
     v_test_db_node0002 | /home/dbadmin/test_db/v_test_db_node0002_data |          0 |       0
     v_test_db_node0003 | /home/dbadmin/test_db/v_test_db_node0003_data |          0 |       0
    (3 rows)
    
    dbadmin=> SELECT SET_LOCATION_PERFORMANCE('/home/dbadmin/test_db/v_test_db_node0002_data', 'v_test_db_node0002', '23', '46');
                               SET_LOCATION_PERFORMANCE
    ------------------------------------------------------------------------------
     /home/dbadmin/test_db/v_test_db_node0002_data performance statistics stored.
    (1 row)
    
    dbadmin=> SELECT node_name, storage_path, throughput, latency FROM disk_storage WHERE storage_usage = 'DATA,TEMP';
         node_name      |                 storage_path                  | throughput | latency
    --------------------+-----------------------------------------------+------------+---------
     v_test_db_node0001 | /home/dbadmin/test_db/v_test_db_node0001_data |          0 |       0
     v_test_db_node0002 | /home/dbadmin/test_db/v_test_db_node0002_data |         23 |      46
     v_test_db_node0003 | /home/dbadmin/test_db/v_test_db_node0003_data |          0 |       0
    (3 rows)
    

    Note that I am getting terrible throughput and latency values 'cause my example above was performed on a 3 node Vertica cluster running on my laptop :)

  • Tim_1Tim_1 Registered User

    Thanks, shame it does not automatically update the DISK_STORAGE table.

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited June 15

    Hi,

    I think having the MEASURE_LOCATION_PERFORMANCE automatically update the THROUGHPUT and LATENCY columns in the DISK_STORAGE system would be a bad idea. That is, what if the results of MEASURE_LOCATION_PERFORMANCE for a particular run are skewed because of some unknown disk activity on a particular storage location? You'd set the values to a value lower than typical. Prior to setting the values in DISK_STORAGE, I execute MEASURE_LOCATION_PERFORMANCE at various times over a 24 our period (via cron), then take the average of those runs.

    Thanks!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file