ROS per node per partition
Hello,
I am using Vertica 9.1 with KSAFE(1).
I have a large table that I want to partition by date (I am using hierarchical partitioning for timestamptz column):
PARTITION BY ((my_table.my_timestamptz_column AT TIME ZONE 'UTC')::date)
GROUP BY (
CASE WHEN ("datediff"('year', (my_table.my_timestamptz_column AT TIME ZONE 'UTC')::date, ((now() AT TIME ZONE 'UTC')::timestamptz(6))::date) >= 2)
THEN (date_trunc('year', (my_table.my_timestamptz_column AT TIME ZONE 'UTC')::date))::date
WHEN ("datediff"('month', (my_table.my_timestamptz_column AT TIME ZONE 'UTC')::date, ((now() AT TIME ZONE 'UTC')::timestamptz(6))::date) >= 2)
THEN (date_trunc('month', (my_table.my_timestamptz_column AT TIME ZONE 'UTC')::date))::date
ELSE (my_table.my_timestamptz_column AT TIME ZONE 'UTC')::date END)
REORGANIZE;
For each partition and each projection (my_projection_b0 and my_projection_b1) Vertica creates 4 ROS conatiners per node. As I understand from documentation, it is supposed to be 1 ROS container per node per partition per projection (after tuple mover handles any inserted/updated data). I have the number which is 4 times higher then expected. I check it with the query:
SELECT COUNT (DISTINCT ros_id) NumROS, node_name FROM PARTITIONS
WHERE projection_name ilike '%my_projection%'
GROUP BY node_name
ORDER BY node_name;
In this case I get 8 ROS per partition per node (but I think this is because of KSAFE(1) which gives my_projection_b0 and my_projection_b1), which is fine. But why 4?
Thank you.
Answers
The number of ROS containers depends on your data. Your 2016 data and prior years will be in ROS containers per year. Your June and July data for 2018 will be in ROS containers per day and everything else in 2017 and 2018 will be in a ROS container per month. Things will continue to shift over time based on the CASE statement used.
A good way to view your data is with the dump_table_partition_keys metafunction. It will show you how each partition key has been placed within each ROS container.
Thank you DaveT,
I have loaded some test data which is a part of 2007 year, so it is supposed to be stored in one container per node, but I got 8 (4+4), for that. Below is a part of dump_table_partition_keys result (just for one node):
Partition keys on node v_test_node0001
Projection 'projection_test_01_b0'
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Projection 'projection_test_01_b1'
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
Storage [ROS container]
No of partition keys: 122
Partition keys: 2007-08-01, 2007-08-02, 2007-08-03, 2007-08-04, ...
In the quick example below, I am getting 1 ROS container for each Partition Key by node:
Thank you for testing.
As I can see there are different partition keys per different ROS container. Which I think is fine (merged by month). In my case there are 4 containers with the same keys which is confusing.
Every time I load data to the table it creates 4 new containers with the same keys.
@Sergey_T - I did a larger test of inserting a lot of 2007 data.
Thank you Jim_Knicely,
I was looking to find something like you got, what is supposed to be according to the documentation. Do you have any idea why Vertica can create more containers as happened in my case?