How do I check the last updated time stamp on tables in my vertica cluster?

Hi,

 

I am looking at a a vertica database of about 80 tables, most of which are not used. I want to figure out which ones have not been used in a while so I can rename and delete them. Is there a script I can run to figure out when all tables in the database were last updated?

 

Thanks.

 

C

Comments

  • Hi , 

     

    This a sql i use to look at unused projections.(this will show all the projections that have quried at a point).

     - remove unsed projections since they will slowdown loading and refreshing, the TM will do more work then necessary. 

     SELECT p.table_schema,
    p.basename AS table_name,
    max(pu.last_used) AS last_used,
    date_part('day'::varchar(3), (((statement_timestamp())::timestamp - max(pu.last_used)))) AS days_last_used,
    round(sum((sc.used_bytes / 1048576::float)), 3) AS used_mb,
    round((sum(sc.used_bytes) / 1073741824::float), 3) AS used_gb
    FROM ((( SELECT dc_projections_used.projection_oid,
    max(dc_projections_used."time") AS last_used
    FROM v_internal.dc_projections_used
    WHERE (dc_projections_used.table_oid IN ( SELECT tables.table_id
    FROM v_catalog.tables
    WHERE (NOT tables.is_system_table)))
    GROUP BY dc_projections_used.projection_oid) pu JOIN ( SELECT vs_projections.oid,
    vs_projections.name,
    vs_projections.basename,
    vs_projections.schemaname AS table_schema
    FROM v_internal.vs_projections) p ON ((p.oid = pu.projection_oid))) JOIN ( SELECT storage_containers.projection_id,
    storage_containers.used_bytes
    FROM v_monitor.storage_containers) sc ON ((sc.projection_id = pu.projection_oid)))
    GROUP BY p.table_schema,
    p.basename
    ORDER BY max(pu.last_used);

     You also might have empty tables (only the DDL is there on projections for them).

    - remove them since is creating to much confusion/hard to manage.

    select * from tables where table_id not in (
    select anchor_table_id from projection_storage);

    See if you have views pointing to this tables.

    select table_schema||'.'||table_name from views where view_definition ilike '%schema.table_name%';

     

    Also you might have projections that have never been used:

    select * from projections where projection_name
    not in (select distinct projection_name from projection_usage);

    Note:

    - this will depend on the retention policy on the dc_projections_used table.

    -this query will tell you how long back can you look for query requests info.

    select min(time),max(time) from dc_projections_used;

    - this can be increased also but you need to considerthe space usage of the table(remember all DC table are stored on disk, so quering this table is perf killer).

     

  •  Ohh almost forgot !!! 

    This is something that might trasah your cleanup strategy as did with mine a while back.

     

      If you have a table that was created , populated and queried a while back , let`s say 60 days. This table should be considered for cleanup ! right ! 

     But as a good DBA you want to have statistics for your table(all of them), so when you run statistics on this table guess what happens ? the table wil no longer appear as unused, it looks like the analyze_stat will 'query' the table and it will appaer in the used_projections table.

     This can be misleading when you try to identify the real unused tables. 

     

    So as tactict to avoid this, now i query the table dc_projection_used table and run my stats only on used table.

     select distinct'select analyze_statistics('''||table_schema||'.'||table_name||''');' from dc_projections_used where is_virtual='false';

     

     

  • Hi. Thanks for the reply. The queries you provided preved very helpful.

     

    Is there a way to see the when the tables themselves were last modified (INSERT/UPDATE)?

     

    Thanks.

     

    C

  • Does anyone have solution for this?

     

  • You can use the PROJECTION_USAGE system table, io_type = 'output'.  Review the current effective retention policy for the corresponding Data Collector tables since those tables control the timeframe covered by the system table.

     

      --Sharon

     

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file