Table Design With Many Partitions
Hello,
I'm doing some performance testing on a new schema design that is essentially a very wide, sparse table. I'd like to be able to quickly delete sections of this table, so I naturally lean towards partitioning it and issuing a DROP_PARTITION call when needed.
However, I noticed that the system limits page says a table can have no more than 1024 partitions, and that you should keep it around 10-20 for good performance. Does anyone know whether this is a soft limit that could potentially be increased, and if so, what the performance implications would be of having potentially millions of partitions?
Any input is appreciated. Thank you.
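For context, here is roughly the setup I'm testing; the table and column names are made up, and I'm using the range-based DROP_PARTITIONS variant:

-- Hypothetical wide, sparse table, partitioned by a batch identifier.
CREATE TABLE wide_sparse (
    batch_id  INT NOT NULL,  -- partition key
    entity_id INT NOT NULL,
    attr_001  FLOAT,
    attr_002  FLOAT
    -- ... many more sparse columns ...
)
PARTITION BY batch_id;

-- Deleting a section of the table is then a single metadata call;
-- DROP_PARTITIONS takes a minimum and maximum partition-key value:
SELECT DROP_PARTITIONS('wide_sparse', '1', '1');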
Comments
It is actually related to ROS containers: many partitions means many ROS containers, and Vertica limits a table to 1024 ROS containers.
Each partition has at least one ROS container, which is why the limit is 1024.
The limit on partition keys reflects the limit on ROS containers. Even with far fewer than 1024 partitions, you can run into ROS container limitations quickly, for example by loading into many partitions at once in concurrent streams, which leaves a single partition with multiple ROS containers.
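You can see how many ROS containers each partition currently holds from the partitions system view; a sketch, assuming the wide_sparse table from the question (adjust the projection filter to your own names):

-- The view has one row per partition per ROS container per node,
-- so count distinct containers for each partition key:
SELECT partition_key,
       COUNT(DISTINCT ros_id) AS ros_containers
FROM   v_monitor.partitions
WHERE  projection_name ILIKE 'wide_sparse%'
GROUP  BY partition_key
ORDER  BY ros_containers DESC;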
I suggest you look into a subpartitioning strategy.
Note:
- There are no subpartitions in Vertica, but you can apply the concept by creating "buckets".
If you share the table DDL, I can put together a small example of what this might look like.
TBL
|__ Part1
|   |__ bkt1 (generated from your partition key or another value contained in that partition's data)
|   |__ bkt2
|__ Part2
    |__ bkt1
    |__ bkt2
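In the meantime, here is a minimal sketch of the bucket idea with made-up names; it assumes HASH() is acceptable in your partition expression, and the bucket count of 4 is purely illustrative:

CREATE TABLE tbl (
    part_key INT NOT NULL,   -- your real partition key
    id       INT NOT NULL,
    payload  VARCHAR(100)
)
-- Fold a small hash bucket into the partition expression, so each
-- logical partition splits into a fixed number of buckets:
PARTITION BY part_key * 10 + (HASH(id) % 4);

-- Dropping "Part1" (part_key = 1) then means dropping its bucket
-- range 10..13 in one call:
SELECT DROP_PARTITIONS('tbl', '10', '13');

Keep the bucket count small: it multiplies the number of partition keys, so it eats into the same 1024 limit discussed above.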