Is it possible to isolate the workload of ML functions from other loads of the Vertica DB?
Asked during the BDC session Model Management and Data Preparation @WD @Arash_Fard
Yes! There are several techniques to use, based on how much isolation you need.
1. Use a resource pool. This approach doesn't require any additional hardware, and lets you cap how much of the existing cluster's resources (memory, CPU, concurrency) ML queries can use.
2. Use a (secondary) subcluster, if you are using Eon and can provision more machines. These machines will have access to an up-to-date copy of all data in the database, and the processing aspect will be fully isolated from the other Vertica nodes.
3. Use import/export, object backup/restore, or similar, to move the data to another fully isolated cluster. This allows work on a snapshot of the data, which has its pros and cons, depending on what kind of ML you are doing.
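For option 1, here is a rough sketch of what the setup could look like. It is a small Python helper that only builds and prints the SQL; it does not connect to Vertica. The pool name `ml_pool`, the user `ml_user`, and the memory and concurrency values are made-up placeholders, so size them for your own cluster and check the parameters against the documentation for your Vertica version.

```python
def resource_pool_statements(pool, user, memory="4G", max_memory="8G", max_concurrency=2):
    """Build Vertica SQL that creates a capped resource pool and routes a user to it.

    All argument values are illustrative assumptions, not recommendations.
    """
    return [
        # Reserve some memory for the pool and cap how far it can grow and
        # how many queries it runs at once.
        f"CREATE RESOURCE POOL {pool} MEMORYSIZE '{memory}' "
        f"MAXMEMORYSIZE '{max_memory}' MAXCONCURRENCY {max_concurrency};",
        # Allow the user to run queries in the pool.
        f"GRANT USAGE ON RESOURCE POOL {pool} TO {user};",
        # Make the pool the user's default, so their ML queries land there.
        f"ALTER USER {user} RESOURCE POOL {pool};",
    ]


if __name__ == "__main__":
    for statement in resource_pool_statements("ml_pool", "ml_user"):
        print(statement)
```

Instead of changing the user's default pool, a session can also switch pools with `SET SESSION RESOURCE_POOL = ml_pool;` just before running the ML functions.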
ML model generation is already segregated by the built-in BLOBDATA resource pool. I don't know whether data preparation functions are isolated in the same way, but Chuck's answer still applies there.