How Partition is different in Vertica compare to Oracle and managing partitions in vertica
I am trying to build the similar tables of Oracle with partitions and snapshot by month which is partitions in Vertica. Not sure if I can do same partitions in Vertica and does partitions will stored in each node or each node will have different partitions. Query performance will improve or degrade.
Can someone please share your ideas or thoughts.
Can someone please share your ideas or thoughts.
0
Comments
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/AdministratorsGuide/Partitions/Partitioni...
On the top if created the projections will it work with the partition table and does it degrade the performance ?
Starting from you first posts's question Partitions in Vertica work like any other partitions in other database with some limited functionality
( not with all the functionality like Oracle) . You can move partitions, drop partitions.
Partition storage on each / or only one depends on the table data distribution.
There can be three scenario where I can see such questions arising
1. When table is segmented on all nodes
2. When table is unsegmented on all nodes
3. When table is segmented only on one node( all data only on one node)
Lets go through what will happen in each scenario.
Example :
We have three tables according to above scenario:
Table A - segmented on all nodes
Table B - unsegmented on all nodes
Table C - segmented only on one node
Partition on all tables with monthly expression on a date column.
Data in all tables from Nov 2011 to July 2012
1. In Table A the there will be 9 partitions on each node. The 9 partitions will be similar on each node, besides the data will be different on each node ( remember - the data is segmented on some unique ID and not the date column )
2. In Table B there will be 9 partitions again on each node. The 9 partition will be similar on each node and the data will also be similar on each node's 9 partitions.
3. In Table C there will be 9 partitions on the node where table is segmented and there will be all data which is partitioned in this node only.
Coming to second question Guidelines :
1. Take care you do not exceed partitions by 1024 ( this is hard limit on each node)
2. If you are creating partition in each node with date field and monthly expression,perhaps hats the best partition to do in Vertica ( In such a way, you won't exceed the limit)
3. Managing partition is another aspect, and you need to monitor and take care of this (Managing - Droping /Moving /Purging)
Coming to your last question In Vertica Partitioning is a table property and projections are altogether different objects.
So you can create projections on tables, whether or not they have partition, it does not affect the projections.
Partition help in partition pruning ( a feature of vertica wheren scanning of partition which are not in predicate clause are eliminated, thus increasing the performance) , so it is good to have partitions.
Hope this helps.
Which use case you try to address by using partition ? are you plan to use it for purging ? or it for performance ? , for purging you will do benefited for it , performance point in most use cases you will not see much differences .
Thanks again.
Glad that it helped you.
Considering your scenario, where you load into one partition and have replica of same partition available for View, this can help you have same functionality in vertica
Moving partitions
You can use stage tables to do it, Load into stage table and Keep target table always available for views. Move partitions from source / stage table to target tables..
Hope this helps.
Vertica suggest each dimension table to be replicated on all nodes so as to facilitate your joins between Fact and dimensions to be faster ( If the dimensional tables are small enough ).
In Vertica Partitioning provides faster data purging and better query performance by segregating data on each node.
In your case, If you have large dimensional table comparable with your fact table, then you should think of going for partitions on dimension tables and also segmenting it on all nodes.
One more thing Vertica assumes your dimensional tables will be small and accordingly the statement in documentation states it to be without partitions.
Hope this helps.
I will try with adding partition to the dimensional table and see the performance and let you know.