Options

Effect of partitions on performance of queries not using the partition key

Hi,

I understand that that table partitions are created to improve performance of deletes/easy archiving/pruning.

If I have a query that does not use a partition key, what happens to the query performance? In Teradata, the query performance will degrade if you do not use the partition key in your query.  How does it work in Vertica?

Also pls help me understand relationship between files and containers.  I know that there will be one file for each column and one container holds one partition.  Are there any size limits per file or container?

Thanks
Rao

Comments

  • Options
    Hi

    Kindly find answer your question as below :

    1) If I have a query that does not use a partition key, what happens to the query performance?
    Answer: Partitions can improve parallelism during query execution and also enable some optimizations that can improve query performance hence it will surely impact the performance in case partition key being not used.

    2) Also pls help me understand relationship between files and containers.I know that there will be one file for each column and one container holds one partition.  Are there any size limits per file or container?
    Answer: For simple understanding think of a ROS container as a file.  In Vertica, once a ROS file is written, it is NEVER re-written.  
    Vertica imposes a limit of 1024 ROS Containers per projection.  This seems low, but in some contexts Vertica has to open many or all of the ROS Containers at once.  In each ROS container, there is one operating system file per column (this is the columnar nature of Vertica).  Vertica must avoid hitting the system limit on open files, and also supply enough memory for buffers for the open files.  The ROS Container limit helps Vertica operate in a range where it is not in danger of hitting these limits.  While having more ROS containers increases opportunities for parallelism in query execution, growing the counts beyond the number of CPU cores per node is unlikely to increase query speed.
    A partitioned table adds further pressure on the number of files, because the data for each partition is stored in its own ROS container.  So the number of partition keys is another multiplier on the number of files needed.
    There is always at least one file per partition key in the table, so for 1024 partitions keys you need a minimum of 1024 ROS containers.  Therefore, the limit on partition keys reflects the limit on ROS containers.

    Regards
    Rahul Choudhary
  • Options
    Thank you.

    can you pls explain in detail how performance will degrade if I do not use the partition key  in my queries?

    The 1024 limit applies only to the partitions? or projections as well?  I thought it applies only to the partitions.

    Is there any criteria Vertica recommends to determine a table as a candidate for partitioning?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file