Options

Statistics for the epoch column

yamakawayamakawa Vertica Customer

Thank you for all your help.

Premise

  • Currently we are loading Vertica tables using the COPY command.
    Vertica tables are reference only and not updated.
    (Deletion by truncate may be performed)

  • After the load process, use select analyze_statistics('public.table_name', 10);.

  • The WLA is automatically executed at 18:00 every day.

The phenomenon

When I checked today after my vacation, the following messages were output to Workload Analyzer.

analyze statistics on table column public.table_name.column_name

When I checked the details, I found that most of them were for the epoch column.

The statistics on table column public.table_name.column_name
Is it possible that the statistics of the epoch column are not retrieved by simply retrieving the statistics when the table is loaded?

I will try to retrieve the statistics again according to the Management Console.
If you know the cause of the output, please reply.

Answers

  • Options
    Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited January 2021

    Pretty sure that Vertica does not collect stats on the Epoch columns in projections.

    The WLA rule is simply querying the dc_optimizer_events for PREDICATE OUTSIDE HISTOGRAM anf NO HISTOGRAM events.

    For it to report that you need analyze stats on the Epoch column, somone must be using the EPOCH column as a predicate?

    Example:

    dbadmin=> select time,
    dbadmin->        event_type as observation_description,
    dbadmin->        p.anchortable as table_id,
    dbadmin->        transaction_id, statement_id,
    dbadmin->        schemaname || '.' || anchortablename || '.' || pc.name as tuning_parameter,
    dbadmin->        'analyze statistics on table column '
    dbadmin->            || schemaname || '.' || anchortablename || '.' || pc.name as tuning_description,
    dbadmin->        'select ' || suggested_action as tuning_command,
    dbadmin->        'MEDIUM' as tuning_cost
    dbadmin-> from dc_optimizer_events dcoe, vs_projection_columns pc, vs_projections p
    dbadmin-> where (event_type = 'PREDICATE OUTSIDE HISTOGRAM' or event_type = 'NO HISTOGRAM')
    dbadmin->       and event_oid = pc.oid and pc.proj=p.oid
    dbadmin->       and not exists (select end_time from dc_analyze_statistics
    dbadmin(>                       where event_oid = proj_column_oid and end_time > dcoe.time);
     time | observation_description | table_id | transaction_id | statement_id | tuning_parameter | tuning_description | tuning_command | tuning_cost
    ------+-------------------------+----------+----------------+--------------+------------------+--------------------+----------------+-------------
    (0 rows)
    
    dbadmin=> CREATE TABLE test_tuning (c1 INT);
    CREATE TABLE
    
    dbadmin=> INSERT INTO test_tuning SELECT 1;
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> SELECT analyze_statistics('test_tuning');
     analyze_statistics
    --------------------
                      0
    (1 row)
    
    dbadmin=> SELECT * FROM test_tuning WHERE c1 = 2;
     c1
    ----
    (0 rows)
    
    dbadmin=> select --time,
    dbadmin->        event_type as observation_description,
    dbadmin->        --p.anchortable as table_id,
    dbadmin->        --transaction_id, statement_id,
    dbadmin->        --schemaname || '.' || anchortablename || '.' || pc.name as tuning_parameter,
    dbadmin->        --'analyze statistics on table column '
    dbadmin->        --    || schemaname || '.' || anchortablename || '.' || pc.name as tuning_description,
    dbadmin->        'select ' || suggested_action as tuning_command,
    dbadmin->        'MEDIUM' as tuning_cost
    dbadmin-> from dc_optimizer_events dcoe, vs_projection_columns pc, vs_projections p
    dbadmin-> where (event_type = 'PREDICATE OUTSIDE HISTOGRAM' or event_type = 'NO HISTOGRAM')
    dbadmin->       and event_oid = pc.oid and pc.proj=p.oid
    dbadmin->       and not exists (select end_time from dc_analyze_statistics
    dbadmin(>                       where event_oid = proj_column_oid and end_time > dcoe.time);
       observation_description   |                   tuning_command                    | tuning_cost
    -----------------------------+-----------------------------------------------------+-------------
     PREDICATE OUTSIDE HISTOGRAM | select analyze_statistics('public.test_tuning.c1'); | MEDIUM
    (1 row)
    
    dbadmin=> SELECT epoch, c1 FROM test_tuning;
     epoch | c1
    -------+----
      2309 |  1
    (1 row)
    
    dbadmin=> SELECT * FROM test_tuning WHERE epoch = 2;
     c1
    ----
    (0 rows)
    
    dbadmin=> select --time,
    dbadmin->        event_type as observation_description,
    dbadmin->        --p.anchortable as table_id,
    dbadmin->        --transaction_id, statement_id,
    dbadmin->        --schemaname || '.' || anchortablename || '.' || pc.name as tuning_parameter,
    dbadmin->        --'analyze statistics on table column '
    dbadmin->        --    || schemaname || '.' || anchortablename || '.' || pc.name as tuning_description,
    dbadmin->        'select ' || suggested_action as tuning_command,
    dbadmin->        'MEDIUM' as tuning_cost
    dbadmin-> from dc_optimizer_events dcoe, vs_projection_columns pc, vs_projections p
    dbadmin-> where (event_type = 'PREDICATE OUTSIDE HISTOGRAM' or event_type = 'NO HISTOGRAM')
    dbadmin->       and event_oid = pc.oid and pc.proj=p.oid
    dbadmin->       and not exists (select end_time from dc_analyze_statistics
    dbadmin(>                       where event_oid = proj_column_oid and end_time > dcoe.time);
       observation_description   |                     tuning_command                     | tuning_cost
    -----------------------------+--------------------------------------------------------+-------------
     PREDICATE OUTSIDE HISTOGRAM | select analyze_statistics('public.test_tuning.c1');    | MEDIUM
     NO HISTOGRAM                | select analyze_statistics('public.test_tuning.epoch'); | MEDIUM
     PREDICATE OUTSIDE HISTOGRAM | select analyze_statistics('public.test_tuning.epoch'); | MEDIUM
    (3 rows)
    
  • Options
    yamakawayamakawa Vertica Customer
    edited January 2021

    Thanks for the reply.

    I double-checked the process, but there is no part where the EPOCH column is explicitly written as a predicate.

    I compared the time in dc_optimizer_events.time and vertica.log, and the process before and after the time is as follows.

    1. truncate table mytable;
    2. copy mytable from stdin delimiter U&'\0001' record terminator E'[EOL]\n' no escape abort on error;
      begin txn
    3. time at dc_optimizer_events.time
    4. copy mytable from stdin delimiter U&'\0001' record terminator E'[EOL]\n' no escape abort on error;
      Starting commit
    5. select get_num_accepted_rows();
    6. select analyze_statistics('mytable', 10);

    Is there any process that refers to the EPOCH column internally?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file