We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Inquiries on the LINEAR_REG function — Vertica Forum

Inquiries on the LINEAR_REG function

jquanjquan Community Edition User

I am interested in using the machine learning functions for linear regression, mainly:

  • LINEAR_REG
  • PREDICT_LINEAR_REG

I have a table that has these columns:

  • time_epoch (numeric) – denoting a "x" the independent variable
  • score (numeric) - denoting a "y" the dependent variables
  • location (VARCHAR) – for each value of this variable, the values of (x,y) are the linear relationship I want to model and predict on
    Example:
dbadmin=> select * from test.my_table;
    time_epoch (x) |   location    |    score (y)
-------------------+------------------------------------+
 1591754400.000000 |       AAAA    |  0.0427034187223203
 1591758000.000000 |       AAAA    |  0.0447975381705089
 1591761600.000000 |       AAAA    |  0.0444945871539881
 1591765200.000000 |       AAAA    |  0.0368878016429624
 1591768800.000000 |       AAAA    |  0.0478755857529369

 1591754400.000000 |       BBBB    |  0.0297034187223203
 1591758000.000000 |       BBBB    |  0.0283424324324324
 1591761600.000000 |       BBBB    |  0.0274945871539881
 1591765200.000000 |       BBBB    |  0.0268878016429624
 1591768800.000000 |       BBBB    |  0.0268755857529369

 1591754400.000000 |       CCCC    |  0.0392133432432432
 1591758000.000000 |       CCCC    |  0.0372343243243234
 1591761600.000000 |       CCCC    |  0.0363143243243243
 1591765200.000000 |       CCCC    |  0.0354382943203243
 1591768800.000000 |       CCCC    |  0.0342343243243243

 1591754400.000000 |       DDDD    |  0.0213213213213213
 1591758000.000000 |       DDDD    |  0.0202132132132132
 1591761600.000000 |       DDDD    |  0.0181321321321321
 1591765200.000000 |       DDDD    |  0.0162132132132132
 1591768800.000000 |       DDDD    |  0.0123213213232321

I want to use LINEAR_REG function to predict the "score" column (i.e., "y" values) at several time_epochs (i.e., "x" values) for each location.
Note: The values of "x" for which we have data for AND for what to predict "y" for will be the same for all locations.

I have 2 questions:
Question 1
From documentation, I cannot use non-numeric values, but is there a way around it? I would like to do something like:

  • **score **column as the "response column"
  • time_epoch, and location as the "predictor columns"
    Invocation example:
SELECT LINEAR_REG('my_prediction', 'test.my_table', 'score', 'time_epoch,location' USING PARAMETERS optimizer='BFGS');

Question 2
Is there a function that can predict linear regression without having to perform two calls (LINEAR_REG and PREDICT_LINEAR_REG). Ideally, I am looking for a single function call for both.

Thanks for your help!

Answers

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file