Partitioning tables in Vertica

I need to partition one of our tables. The table currently has approximately few millions of rows and is expected to grow fast. The main reason for partitioning is to improve the query performance. Almost all the queries on this table use below 2 fields, 1. Primary key field - MessageId 2. Timestamp field - Messagetimestamp I have worked with other databases and have created time based partitions. I need some inputs from experts on this forum. a. Can I use primary key field for partitioning tables in Vertica? b. Should I use some kind of a function (Mod) on the MessageId to reduce the number of partition keys. I have read about too many partitions causing ROS container issues? Thanks, Vivek.

Comments

  • You should choose a partitioning expression based on time. The Tuple Mover likes to maintain "non active" partitions as a single ROS container or close to a single ROS container. It determines active partitions based on recent loads, so you want your new loads to be going into the a single partition or a very small number of partitions in order to avoid unnecessary background work by the Tuple Mover. With a time based expression, it's typical to partition by month if you are keeping years worth of data. --Sharon

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file