We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


partitioning vs many tables — Vertica Forum

partitioning vs many tables

Hi All, We are trying to devise schema in vertica for storing data for our application. There are a large number of publishers for that application and the confusion is whether to store the data in a single table and have partitioning based on publisher id or store each publisher data in a separate table. There are certain operations like delete , recovery et al which create locks on the entire table, hence data push might disrupt for everyone if the data is stored in a single table. However a single table is more easily manageable. Could someone throw some light on the pros and cons of each ? Thanks, Ravi.

Comments

  • Hi Ravi, 

    I think that you are misunderstanding the concept of partition in Vertica. Partition is a property of a table and specifies how data is organized within individual nodes. Vertica offer partition for 2 reasons, 

    1- To easily clean historic data, so you can drop a partition and reclaim the space very fast. 
    2- Partition pruning so when Vertica scan the data filter the partitions. 

    However it is recommended to keep a low number of partition as when partition Vertica separated the partition data in different files. A common partition key is month/year. 

    As a pro/con of partition by publisher 

    Pro, you can easily filter the data when doing queries publisher = xxx

    Con, 

    - too many partition ( not sure the number)
    - too many active partitions ( you will get ros push back when loading as all the active partition will merge with the starta algorithm.
    - delete history data will be with a delete statement, and could take long time and create delete vectors. 

    Not having the possibility to have partition by publisher, I think having different tables may be your answer, but I would like to know what type of deletes you do? If in fact you do too many deletes, I would recommend you to read the delete performance consideration. 

    https://my.vertica.com/docs/6.1.x/HTML/index.htm#12704.htm

    You also need to analyze what type of queries do you do. 

    Hope this helps, 

    Eugenia


Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file