partitioning vs many tables

Ravi_Vijay · October 2013

Hi All, We are trying to devise schema in vertica for storing data for our application. There are a large number of publishers for that application and the confusion is whether to store the data in a single table and have partitioning based on publisher id or store each publisher data in a separate table. There are certain operations like delete , recovery et al which create locks on the entire table, hence data push might disrupt for everyone if the data is stored in a single table. However a single table is more easily manageable. Could someone throw some light on the pros and cons of each ? Thanks, Ravi.

[Deleted User] · October 2013

Hi Ravi,

I think that you are misunderstanding the concept of partition in Vertica. Partition is a property of a table and specifies how data is organized within individual nodes. Vertica offer partition for 2 reasons,

1- To easily clean historic data, so you can drop a partition and reclaim the space very fast.
2- Partition pruning so when Vertica scan the data filter the partitions.

However it is recommended to keep a low number of partition as when partition Vertica separated the partition data in different files. A common partition key is month/year.

As a pro/con of partition by publisher

Pro, you can easily filter the data when doing queries publisher = xxx

Con,

- too many partition ( not sure the number)
- too many active partitions ( you will get ros push back when loading as all the active partition will merge with the starta algorithm.
- delete history data will be with a delete statement, and could take long time and create delete vectors.

Not having the possibility to have partition by publisher, I think having different tables may be your answer, but I would like to know what type of deletes you do? If in fact you do too many deletes, I would recommend you to read the delete performance consideration.

https://my.vertica.com/docs/6.1.x/HTML/index.htm#12704.htm

You also need to analyze what type of queries do you do.

Hope this helps,

Eugenia

We're Moving!

Create My New Community Account Now

partitioning vs many tables

Comments

Leave a Comment