Grouping with moving_average or ARIMA
Trying to use Vertica ML to do some forecasting on some infrastructure metrics. In the documentation for moving_average and ARIMA, there is a somewhat simple example of using this with a table consisting of "time" and "value", the example is supposed to be for temperatures in Melbourne, like
time | Temperature
---------------------+-------------
1981-01-01 00:00:00 | 20.7
1981-01-02 00:00:00 | 17.9
Tried this with sample data and it works fine. But in my environment, my data contains data from several different entities that I would like to forecast individually. For easy comparison with the example, we could say a table that contained temperatures for different cities.
time, city, temperature
How would I use moving_average or ARIMA to forecast on this, but separate this based on "city"?
The input-relation field, does not seem to support any grouping.
Answers
Hi lrng,
Unfortunately, you will need to loop through all your distinct cities values and build a model for each.
Something like this - untested and you'll probably need version 23.4 to train a model inside a stored procedure, but that's the logic: