We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Statistics for the epoch column — Vertica Forum

Statistics for the epoch column

yamakawayamakawa Vertica Customer

Thank you for all your help.

Premise

  • Currently we are loading Vertica tables using the COPY command.
    Vertica tables are reference only and not updated.
    (Deletion by truncate may be performed)

  • After the load process, use select analyze_statistics('public.table_name', 10);.

  • The WLA is automatically executed at 18:00 every day.

The phenomenon

When I checked today after my vacation, the following messages were output to Workload Analyzer.

analyze statistics on table column public.table_name.column_name

When I checked the details, I found that most of them were for the epoch column.

The statistics on table column public.table_name.column_name
Is it possible that the statistics of the epoch column are not retrieved by simply retrieving the statistics when the table is loaded?

I will try to retrieve the statistics again according to the Management Console.
If you know the cause of the output, please reply.

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited January 2021

    Pretty sure that Vertica does not collect stats on the Epoch columns in projections.

    The WLA rule is simply querying the dc_optimizer_events for PREDICATE OUTSIDE HISTOGRAM anf NO HISTOGRAM events.

    For it to report that you need analyze stats on the Epoch column, somone must be using the EPOCH column as a predicate?

    Example:

    dbadmin=> select time,
    dbadmin->        event_type as observation_description,
    dbadmin->        p.anchortable as table_id,
    dbadmin->        transaction_id, statement_id,
    dbadmin->        schemaname || '.' || anchortablename || '.' || pc.name as tuning_parameter,
    dbadmin->        'analyze statistics on table column '
    dbadmin->            || schemaname || '.' || anchortablename || '.' || pc.name as tuning_description,
    dbadmin->        'select ' || suggested_action as tuning_command,
    dbadmin->        'MEDIUM' as tuning_cost
    dbadmin-> from dc_optimizer_events dcoe, vs_projection_columns pc, vs_projections p
    dbadmin-> where (event_type = 'PREDICATE OUTSIDE HISTOGRAM' or event_type = 'NO HISTOGRAM')
    dbadmin->       and event_oid = pc.oid and pc.proj=p.oid
    dbadmin->       and not exists (select end_time from dc_analyze_statistics
    dbadmin(>                       where event_oid = proj_column_oid and end_time > dcoe.time);
     time | observation_description | table_id | transaction_id | statement_id | tuning_parameter | tuning_description | tuning_command | tuning_cost
    ------+-------------------------+----------+----------------+--------------+------------------+--------------------+----------------+-------------
    (0 rows)
    
    dbadmin=> CREATE TABLE test_tuning (c1 INT);
    CREATE TABLE
    
    dbadmin=> INSERT INTO test_tuning SELECT 1;
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> SELECT analyze_statistics('test_tuning');
     analyze_statistics
    --------------------
                      0
    (1 row)
    
    dbadmin=> SELECT * FROM test_tuning WHERE c1 = 2;
     c1
    ----
    (0 rows)
    
    dbadmin=> select --time,
    dbadmin->        event_type as observation_description,
    dbadmin->        --p.anchortable as table_id,
    dbadmin->        --transaction_id, statement_id,
    dbadmin->        --schemaname || '.' || anchortablename || '.' || pc.name as tuning_parameter,
    dbadmin->        --'analyze statistics on table column '
    dbadmin->        --    || schemaname || '.' || anchortablename || '.' || pc.name as tuning_description,
    dbadmin->        'select ' || suggested_action as tuning_command,
    dbadmin->        'MEDIUM' as tuning_cost
    dbadmin-> from dc_optimizer_events dcoe, vs_projection_columns pc, vs_projections p
    dbadmin-> where (event_type = 'PREDICATE OUTSIDE HISTOGRAM' or event_type = 'NO HISTOGRAM')
    dbadmin->       and event_oid = pc.oid and pc.proj=p.oid
    dbadmin->       and not exists (select end_time from dc_analyze_statistics
    dbadmin(>                       where event_oid = proj_column_oid and end_time > dcoe.time);
       observation_description   |                   tuning_command                    | tuning_cost
    -----------------------------+-----------------------------------------------------+-------------
     PREDICATE OUTSIDE HISTOGRAM | select analyze_statistics('public.test_tuning.c1'); | MEDIUM
    (1 row)
    
    dbadmin=> SELECT epoch, c1 FROM test_tuning;
     epoch | c1
    -------+----
      2309 |  1
    (1 row)
    
    dbadmin=> SELECT * FROM test_tuning WHERE epoch = 2;
     c1
    ----
    (0 rows)
    
    dbadmin=> select --time,
    dbadmin->        event_type as observation_description,
    dbadmin->        --p.anchortable as table_id,
    dbadmin->        --transaction_id, statement_id,
    dbadmin->        --schemaname || '.' || anchortablename || '.' || pc.name as tuning_parameter,
    dbadmin->        --'analyze statistics on table column '
    dbadmin->        --    || schemaname || '.' || anchortablename || '.' || pc.name as tuning_description,
    dbadmin->        'select ' || suggested_action as tuning_command,
    dbadmin->        'MEDIUM' as tuning_cost
    dbadmin-> from dc_optimizer_events dcoe, vs_projection_columns pc, vs_projections p
    dbadmin-> where (event_type = 'PREDICATE OUTSIDE HISTOGRAM' or event_type = 'NO HISTOGRAM')
    dbadmin->       and event_oid = pc.oid and pc.proj=p.oid
    dbadmin->       and not exists (select end_time from dc_analyze_statistics
    dbadmin(>                       where event_oid = proj_column_oid and end_time > dcoe.time);
       observation_description   |                     tuning_command                     | tuning_cost
    -----------------------------+--------------------------------------------------------+-------------
     PREDICATE OUTSIDE HISTOGRAM | select analyze_statistics('public.test_tuning.c1');    | MEDIUM
     NO HISTOGRAM                | select analyze_statistics('public.test_tuning.epoch'); | MEDIUM
     PREDICATE OUTSIDE HISTOGRAM | select analyze_statistics('public.test_tuning.epoch'); | MEDIUM
    (3 rows)
    
  • yamakawayamakawa Vertica Customer
    edited January 2021

    Thanks for the reply.

    I double-checked the process, but there is no part where the EPOCH column is explicitly written as a predicate.

    I compared the time in dc_optimizer_events.time and vertica.log, and the process before and after the time is as follows.

    1. truncate table mytable;
    2. copy mytable from stdin delimiter U&'\0001' record terminator E'[EOL]\n' no escape abort on error;
      begin txn
    3. time at dc_optimizer_events.time
    4. copy mytable from stdin delimiter U&'\0001' record terminator E'[EOL]\n' no escape abort on error;
      Starting commit
    5. select get_num_accepted_rows();
    6. select analyze_statistics('mytable', 10);

    Is there any process that refers to the EPOCH column internally?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file