Need to scale my Vertica cluster up
I need to scale my cluster up without impacting my current PROD cluster.
I have 2 different options in mind.
Option 1:
1. snapshot current cluster
2. restore snapshot on different cluster with same node count
3. add more nodes to this new cluster
4. rebalance data
5. make new bigger cluster PROD
Option 2:
1. create new blank Vertica cluster with more nodes
2. copy data from PROD cluster to new cluster using
EXPORT TO VERTICA db.schema.table AS SELECT * FROM schema.table
3. make new bigger cluster PROD
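The copy step in the second option could look roughly like the sketch below, run from the current PROD cluster. EXPORT TO VERTICA needs an open connection to the target database first, and the target table has to exist there already. The database name newdb, the host, port, user, password, and the table store.sales are all placeholders for illustration, not values from this thread.

```sql
-- Run on the source (PROD) cluster. All names below are placeholders.
-- Open a connection from this session to the new cluster.
CONNECT TO VERTICA newdb USER dbadmin PASSWORD 'secret' ON 'new-cluster-host', 5433;

-- Push one table across. The target table must already exist in newdb.
EXPORT TO VERTICA newdb.store.sales AS SELECT * FROM store.sales;

-- Close the connection once all tables have been exported.
DISCONNECT newdb;
```

The EXPORT statement would be repeated once per table, so in practice it is probably worth generating the statements from the list of tables rather than typing them by hand.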
Any advice on which option is better/faster?
Are any other methods available?
I would take a full backup of my cluster, add a new node to the cluster, and then rebalance the data across the nodes.
This does not require you to stop the database, and the rebalance can be done during a low-activity period.
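A rough sketch of the rebalance step, assuming the new nodes have already been added to the database (for example with admintools). START_REBALANCE_CLUSTER runs the rebalance as a background task, and the REBALANCE_TABLE_STATUS system table can be polled to watch progress. Treat this as an outline and check the details against the documentation for your Vertica version.

```sql
-- Assumes the new nodes were already added to the database.
-- Start the rebalance as a background task so the session is not blocked.
SELECT START_REBALANCE_CLUSTER();

-- Watch per-table progress while the rebalance runs.
SELECT * FROM v_monitor.rebalance_table_status;

-- Only if needed: stop the rebalance, e.g. if production queries slow down.
SELECT CANCEL_REBALANCE_CLUSTER();
```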
What is important is that I am planning to build a new cluster, and when it is ready I will move queries to it.
This way my PROD operations will not be impacted.
This would be my approach.
- Export your metadata and recreate it on the new cluster so that it complies with your chosen k-safety.
- Create a database user on the production database (with full access to its objects/tables).
- Create a dedicated resource pool that will not affect the performance of your production database while it is in use, and give the created user access to this pool.
- Using CONNECT TO VERTICA (connecting as the newly created user) plus the COPY FROM VERTICA command, copy your data from one database to the other.
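The steps above might translate into something like the following. It is a sketch under assumptions: the user copy_user, the pool copy_pool and its limits, the schema store, the database name proddb, the file path, and the host are all made up for illustration, and the pool settings would need tuning for a real workload. SELECT privileges are shown as the minimum needed to read the data; broader access can be granted if preferred.

```sql
-- On the PRODUCTION cluster: dump the DDL (tables, projections) to a file
-- that can be reviewed and replayed on the new cluster.
SELECT EXPORT_CATALOG('/tmp/prod_design.sql', 'DESIGN');

-- On the PRODUCTION cluster: a capped, low-priority pool and a dedicated user.
CREATE RESOURCE POOL copy_pool MAXMEMORYSIZE '4G' MAXCONCURRENCY 2 PRIORITY -10;
CREATE USER copy_user IDENTIFIED BY 'secret';
GRANT USAGE ON RESOURCE POOL copy_pool TO copy_user;
ALTER USER copy_user RESOURCE POOL copy_pool;
GRANT USAGE ON SCHEMA store TO copy_user;
GRANT SELECT ON ALL TABLES IN SCHEMA store TO copy_user;

-- On the NEW cluster, after replaying the DDL: pull the data table by table.
CONNECT TO VERTICA proddb USER copy_user PASSWORD 'secret' ON 'prod-host', 5433;
COPY store.sales FROM VERTICA proddb.store.sales;
DISCONNECT proddb;
```

Pulling from the new cluster keeps the load step on the new hardware, while the capped pool is meant to limit how much the read side can take from production.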
My cluster work and Vertica's rebalancing must not impact performance. Query response time is critical for us.
This is why I want to build the new cluster in parallel with the working one.
When the new cluster is ready I will run benchmarks to make sure it performs better, and only after that move operations to it.
Also, I am trying to get a feel for which method will take less time.
I will do it in AWS and I want to run the 2 clusters in parallel for the shortest possible period of time.
This way I will try to keep the AWS bill as low as possible.
My cluster is 9 nodes now.
I am thinking about building a 15-node cluster.
What will take less time: rebalancing the cluster, or copying the same data from one cluster to another?
Rebalance is also more tolerant of failure: if interrupted, it can restart from where it left off.
Rebalance can be slower if you have a massive cluster, such as going from 100 to 200 nodes, but that's not your situation.
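To put numbers behind the comparison, it may help to look at how much data is actually stored. The query below sums projection storage per node from the PROJECTION_STORAGE system table. As a back-of-the-envelope bound, assuming evenly segmented data: a copy to a new cluster moves all of the data, whereas after growing from 9 to 15 nodes the 6 new nodes hold 6/15 (40%) of it, so at least that much has to move in a rebalance. How much a real rebalance moves beyond that lower bound depends on the projection design and the Vertica version.

```sql
-- Stored data per node, in GB. This counts every projection,
-- so buddy projections kept for k-safety are included in the totals.
SELECT node_name,
       ROUND(SUM(used_bytes) / 1073741824.0, 1) AS used_gb
FROM v_monitor.projection_storage
GROUP BY node_name
ORDER BY node_name;
```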