Partitioning Strategey

I have table which has more than a milion records ingested per day . In the table I have a Timestamp_Day column which only holds the date part of day.

Also I need to be able to drop records older than 180 days on any day.

If the table is partitioned by TimeStamp_Day column which is of Type Date, will I exceed the recomended limit of 40 ROS containers per projection?

for the current data I have which is only for 2 different days I see there are already 16 ROS containers

what is the best Partitioning strategey for this kind of a table?

Answers

  • moshegmosheg Vertica Employee Administrator

    If you partition the table by the Timestamp_Day column using hierarchical partitioning with the CALENDAR_HIERARCHY_DAY function, as described in the Vertica documentation example below, it is less likely that you will exceed 40 ROS containers per projection.

    The CALENDAR_HIERARCHY_DAY function organizes date partitions into a hierarchy of groups, effectively consolidating older partitions into larger time spans (such as years and months). This dynamic grouping helps manage the number of ROS containers efficiently, even when dealing with a large volume of daily ingested records.

    See: https://docs.vertica.com/24.1.x/en/admin/partitioning-tables/hierarchical-partitioning

  • VValdarVValdar Vertica Employee Employee

    Hi prad33p2301,

    180 partitions would be acceptable with a daily partition.

    You could go with a group by partition, keep daily partition for xx days then move to weekly partition.
    And instead dropping one partition par day after 180 days, drop the whole week.
    For example with 28 daily + 22 weekly you get a 50 ROS table handling up to 182 days. Seems a good direction!

  • Thank @mosheg , @VValdar Considering that I will have 5M records a day , what will be the impact on the read and write performance when I drop a partition whichis having 5M records and what will be the case when I drop a weeks worth of data which will be 35M records

  • VValdarVValdar Vertica Employee Employee

    5M rows a day is not that much, maybe weekly partition is good enough in all cases.
    Dropping a partition is more or less instantaneous whatever it's size; it's a catalog operation + removing files, very low load for the database.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file