Auto-scaling availability
kxu
Administrator
When will the auto-scale feature be available?
Sizing and Configuring Vertica in Eon Mode for Different Use Cases
@skeswani @skamat
Answers
As of 4/2/2020, auto-scaling is on our roadmap without a specific date. Many people ask for it thinking that it would just be a feature to turn on, but it's actually a bit more complex and I would appreciate hearing your requirements. For example, do you expect to scale by adding nodes to a subcluster or by adding subclusters to a workload? Would you like to establish an absolute maximum number of nodes/subclusters to control costs? What about a minimum number to prevent it from shutting off? How rapidly would you like to scale: immediately, or only once it is clear that the increase is not just a short spike? And, of course, there are questions about scaling down as well. I will watch here for your comments. Thanks!
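To make those questions concrete, here is a minimal sketch of what such requirements could look like if written down as a policy. It is not a Vertica feature or API: the `ScalingPolicy` class, its field names, and the `desired_nodes` function are all hypothetical, and "load" is whatever metric you choose (CPU, queue depth, concurrent queries).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy object; every field name is illustrative."""
    min_nodes: int          # floor, so the subcluster never shuts off
    max_nodes: int          # ceiling, to control costs
    scale_up_load: float    # load above this suggests adding a node
    scale_down_load: float  # load below this suggests removing a node
    sustain_samples: int    # consecutive samples (>= 1) required before acting


def desired_nodes(policy: ScalingPolicy, current: int, loads: Sequence[float]) -> int:
    """Return the node count the policy asks for, given recent load samples.

    `loads` is ordered oldest to newest.
    """
    recent = list(loads)[-policy.sustain_samples:]
    target = current
    # Act only when the whole window agrees, so a short spike is ignored.
    if len(recent) == policy.sustain_samples:
        if all(x > policy.scale_up_load for x in recent):
            target = current + 1
        elif all(x < policy.scale_down_load for x in recent):
            target = current - 1
    # Clamp the result to the configured minimum and maximum.
    return max(policy.min_nodes, min(policy.max_nodes, target))
```

Stepping by one node at a time and requiring a full window of agreeing samples is just one possible answer to the "how rapidly" question; a different answer would change only `desired_nodes`.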
Auto-scaling is an operational function. Scaling is a product feature.
You can scale a Vertica cluster by adding nodes and/or subclusters trivially, just by running a single admintools command.
To auto-scale, your DevOps scripts need to call these commands based on your use case and needs.
At the most trivial level, you can use a cron daemon to call these commands at 9am and 5pm to scale up/down.
At a more complex level, you can create a CloudFormation script to achieve this.
Please reach out to us if you need assistance with using Vertica commands to scale your cluster up/down and with plumbing them into your specific solution.
Let me elaborate my comments...
1.
The auto-scaling functionality provided by AWS will terminate an instance during scale-down and provision a brand new one during scale-up. This leads to an interesting problem. When a new instance is started as part of the scale-up, it does not have the dbadmin's SSH keys, and hence cannot join an existing cluster.
You can obviously bake such an SSH key into the AMI, but then it will need to be your (the customer's) key and not the AMI provider's (Vertica's) key. That is, we (Vertica) cannot provide an AMI that auto-scales, since we cannot use a common SSH key and we don't have your (the customer's) SSH key.
This is indeed a minor problem but an important one, which prevents out-of-the-box use of AMIs for auto-scaling, especially for software that has a notion of clusters or groups.
Hence, in order to use auto-scale, you will need to create your own AMI, trivially derived from a Vertica AMI, with your SSH keys baked in (either on the root volume or via the user-data section). Such an AMI can then be used to auto-scale.
During scale-up, as part of the user-data, you will need to run "admintools -t db_add_subcluster" or "admintools -t db_add_node"
(depending on whether you are adding an Auto Scaling group, or scaling an existing Auto Scaling group up or down).
This too can be done via some sort of lifecycle event (or shutdown hook) and/or the user-data section (or cloud-init).
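As a rough sketch of what the user-data step might run, the helpers below only build the admintools command lines; they do not execute anything. The function names, database name, host addresses, and subcluster names are all made up for illustration. The `-d` (database), `-s` (hosts), and `-c` (subcluster) flags are assumptions to check against the tool's own help output on your Vertica version, and password handling is left out.

```python
import shlex
from typing import List


def add_node_command(database: str, new_hosts: List[str], subcluster: str) -> List[str]:
    """Build an `admintools -t db_add_node` call that adds hosts to a subcluster."""
    if not new_hosts:
        raise ValueError("at least one host is required")
    return [
        "admintools", "-t", "db_add_node",
        "-d", database,             # assumed flag: database name
        "-s", ",".join(new_hosts),  # assumed flag: comma-separated hosts
        "-c", subcluster,           # assumed flag: target subcluster
    ]


def add_subcluster_command(database: str, hosts: List[str], subcluster: str) -> List[str]:
    """Build an `admintools -t db_add_subcluster` call for a new subcluster."""
    if not hosts:
        raise ValueError("at least one host is required")
    return [
        "admintools", "-t", "db_add_subcluster",
        "-d", database,
        "-s", ",".join(hosts),
        "-c", subcluster,
    ]


def render(command: List[str]) -> str:
    """Quote the argument list so it can be pasted into a shell script."""
    return " ".join(shlex.quote(arg) for arg in command)


if __name__ == "__main__":
    # Hypothetical values; a real script would read them from instance metadata.
    print(render(add_node_command("verticadb", ["10.0.0.7"], "analytics")))
```

Keeping the command as a list until the last moment means the same helper can feed either `subprocess.run` as dbadmin or a generated cloud-init script.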
Hence auto-scaling cannot be provided out of the box without sharing the notion of the "cluster" you are scaling up or down (i.e. the SSH keys).
There needs to be an outside orchestration function which adds hosts into the "cluster" by providing these keys.
Some will want to scale up at 9AM and scale down at 5PM.
Others will want to scale up during the week and scale down on the weekend.
Others will want to do it at the end of the month or quarter.
Yet others may do it during the holiday season, etc.
Here again, a product can merely allow you to scale up/down, but when to scale up and scale down is specific to your application/use case. Your DevOps scripts will need to encode such logic.
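For illustration, the kind of logic such a script would encode can be as small as the sketch below. The function names, the node counts, the 9-to-5 weekday window, and the "last three days of the month" rule are all example assumptions, not recommendations.

```python
import calendar
from datetime import datetime


def is_month_end(now: datetime, days: int = 3) -> bool:
    """True during the last `days` days of the month (example rule only)."""
    last_day = calendar.monthrange(now.year, now.month)[1]
    return now.day > last_day - days


def scheduled_nodes(now: datetime, base: int = 3, peak: int = 6) -> int:
    """Return the node count this example schedule wants at time `now`."""
    is_weekday = now.weekday() < 5       # Monday=0 ... Friday=4
    in_business_hours = 9 <= now.hour < 17
    if is_month_end(now) or (is_weekday and in_business_hours):
        return peak
    return base
```

A cron job could call `scheduled_nodes(datetime.now())`, compare the result with the current size of the subcluster, and then run the matching admintools command to add or remove nodes.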
Hence auto-scaling is a DevOps function. Scaling is a product feature.
Vertica currently supports scaling up and down via subclusters and nodes (and has in the past).
Feel free to use it to auto-scale by layering your DevOps scripts over the scaling functionality.