We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Grouping with moving_average or ARIMA — Vertica Forum

Grouping with moving_average or ARIMA

Trying to use Vertica ML to do some forecasting on some infrastructure metrics. In the documentation for moving_average and ARIMA, there is a somewhat simple example of using this with a table consisting of "time" and "value", the example is supposed to be for temperatures in Melbourne, like

    time         | Temperature

---------------------+-------------
1981-01-01 00:00:00 | 20.7
1981-01-02 00:00:00 | 17.9

Tried this with sample data and it works fine. But in my environment, my data contains data from several different entities that I would like to forecast individually. For easy comparison with the example, we could say a table that contained temperatures for different cities.

time, city, temperature

How would I use moving_average or ARIMA to forecast on this, but separate this based on "city"?

The input-relation field, does not seem to support any grouping.

Answers

  • VValdarVValdar Vertica Employee Employee
    edited November 2023

    Hi lrng,

    Unfortunately, you will need to loop through all your distinct cities values and build a model for each.

    Something like this - untested and you'll probably need version 23.4 to train a model inside a stored procedure, but that's the logic:

    create or replace procedure sp_train_model ()
    language PLvSQL
    security definer
    as
    $$
    declare
        v_city   varchar;
        v_cnt    integer not null :=0;
    begin
        for v_city in query select distinct city from MyTable order by 1
        loop
            perform create or replace view v_MyTable_City as select * from MyTable where city = v_city;
            execute 'select arima(''arima_MyTable_' || v_city || ''', ''v_MyTable_City'', ''temperature'', ''time'' using parameters p=2, q=2)';
            v_cnt := v_cnt+1;
        end loop;
        raise info '% models has been trained', v_cnt;
    end;
    $$;
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file