Understanding table partitions and ROS containers.

Why does Vertica impose a limit on the number of partitions? Why does it recommend holding partition counts to a low value? Can I allow more partitions?

Comments

  • Let us start with discussing ROS containers, without partitions. For illustration, think of a ROS container as a file. In Vertica, once a ROS file is written, it is NEVER re-written. The following actions create more ROS files: 1. A COPY DIRECT statement. 2. An INSERT, DELETE or UPDATE statement with the /*+direct*/ hint. 3. A Moveout of WOS data to ROS by the Tuple Mover 4. A Mergeout from a number of ROS files into a new ROS file. 5. The MERGE_PARTITION() function. The last two operations also remove ROS files and lower the total count. Vertica imposes a limit of 1024 ROS Containers per projection. This seems low, but in some contexts Vertica has to open many or all of the ROS Containers at once. In each ROS container, there is one operating system file per column (this is the columnar nature of Vertica). Vertica must avoid hitting the system limit on open files, and also supply enough memory for buffers for the open files. The ROS Container limit helps Vertica operate in a range where it is not in danger of hitting these limits. While having more ROS containers increases opportunities for parallelism in query execution, growing the counts beyond the number of CPU cores per node is unlikely to increase query speed. A partitioned table adds further pressure on the number of files, because the data for each partition is stored in its own ROS container. So the number of partition keys is another multiplier on the number of files needed. There is always at least one file per partition key in the table, so for 1024 partitions keys you need a minimum of 1024 ROS containers. Therefore, the limit on partition keys reflects the limit on ROS containers. Even with a smaller number of partitions than 1024, you can run into ROS container limitations quickly, for example, by loading into many partitions at once in concurrent streams. The recommended partition key count of a few 10s keeps Vertica operating an a range where it is not likely to reach the ROS container limit. The MERGE_PARTITIONS function tells Vertica to store more than one partition key per ROS container. If you allow partition counts to grow close to the limit, you will find that you need to actively manage the ROS containers using MERGE_PARTITIONS to avoid having too many ROS containers. Operate at the recommended partition count to avoid this manual intervention.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file