Storage Container Optimization for Tables not partitioned

edited June 2019 in General Discussion

Version: 9.0.x

I see this message even for those tables with no partitions:

table source: v_monitor.query_events
event_type: STORAGE_CONTAINERS_ELIMINATED
event_details: "Using only 3 stores out of 6 for projection myschema.mytable_b0"

how is this possible? Does this mean I get the benefit of pruning even if the table is not partitioned?

Is the pruning coming from how the table's projection is segmented?

If this is true, then what is the difference between pruning from a partitioned table vs a non partitioned table?

Comments

  • Bryan_HBryan_H Vertica Employee Administrator

    Hi, even though the table is not explicitly partitioned, Vertica splits the data into ROS containers based on segmentation, see for example the diagram at https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/Partitions/PartitioningAndSegmentingData.htm
    The catalog records what data are in each ROS container, so the query engine checks the catalog based on predicate and other query qualifiers and only reads from ROS containers that have relevant data according to the catalog.
    Consider this simple reproducer:

    dbadmin=> create table T(a int, b int) order by a,b;
    CREATE TABLE
    dbadmin=> insert /+direct/ into T values(1,1);
    OUTPUT
    --------
    1
    (1 row)

    dbadmin=> insert /+direct/ into T values(3,3);
    OUTPUT
    --------
    1
    (1 row)

    dbadmin=> commit durable;
    COMMIT
    dbadmin=> select clear_data_collector();
    clear_data_collector
    ----------------------
    CLEAR
    (1 row)

    dbadmin=> select * from T where b=1;
    a | b
    ---+---
    1 | 1
    (1 row)

    dbadmin=> select * from dc_execution_engine_events order by time desc;
    time | node_name | session_id | user_id | user_name | transaction_id | statement_id | request_id | event_type | event_description | operator_name | path_id | event_oid | event_details | suggested_action
    -------------------------------+-------------------+---------------------------------+-------------------+-----------+-------------------+--------------+------------+-------------------------------+--------------------------------------------------------------------------------------+---------------+---------+-------------------+------------------------------------------------------------+------------------
    2019-06-18 23:43:45.997238-04 | v_docker_node0001 | v_docker_node0001-17861:0x91a96 | 45035996273704962 | dbadmin | 45035996289316488 | 1 | 6 | STORAGE_CONTAINERS_ELIMINATED | Some storage containers will not be processed because they contain no relevant data. | StorageUnion | 1 | 45035996303337736 | Using only 1 stores out of 2 for projection public.T_super |
    (1 row)

    So what is happening here? The two rows I insert are hashed into different ROS containers:

    dbadmin=> select * from storage_containers where projection_name LIKE 'T%';
    node_name | schema_name | projection_id | projection_name | storage_type | storage_oid | sal_storage_id | total_row_count | deleted_row_count | used_bytes | start_epoch | end_epoch | grouping | segment_lower_bound | segment_upper_bound | is_sorted | location_label | delete_vector_count | shard_id | shard_name
    -------------------+-------------+-------------------+-----------------+--------------+-------------------+--------------------------------------------------+-----------------+-------------------+------------+-------------+-----------+------------+---------------------+---------------------+-----------+----------------+---------------------+----------+------------
    v_docker_node0001 | public | 45035996303337736 | T_super | ROS | 45035996303337773 | 02209959c0505871379fd34d7bcd07b000a0000001c4292b | 1 | 0 | 54 | 3786952 | 3786952 | PROJECTION | 0 | 4294967295 | t | data1 | 0 | 0 |
    v_docker_node0001 | public | 45035996303337736 | T_super | ROS | 45035996303337785 | 02209959c0505871379fd34d7bcd07b000a0000001c42937 | 1 | 0 | 54 | 3786952 | 3786952 | PROJECTION | 0 | 4294967295 | t | data1 | 0 | 0 |
    (2 rows)

    The catalog records that ROS #1 has row (1,1) while ROS#2 has row (3,3) and presumably records the lower and upper bounds of the integers stored in "a" and "b" in each. So when I query for "b=1", the catalog reports that only ROS#1 contains matching data, so we get the STORAGE_CONTAINERS_ELIMINATED operation message.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file