Inverse statistical distributions (z, t, chi-square)?

Does Vertica have a way to calculate p-values from common test statistics, such as Z, T, or Chi-Square tests?
For example, something like INVERSE_T(t, df)?

Alternatively, what's the best hello-world example to create a custom scalar function in Python or Java?

Answers

  • SruthiA (Vertica Employee, Administrator)

    Currently, we don't have any functions to calculate p-values or perform t-tests.

    As for a Python UDx, please see the sample below. First download add2ints.py from GitHub:

    https://github.com/vertica/UDx-Examples/blob/master/Python/add2ints/add2ints.py

    eonv1203=> CREATE OR REPLACE LIBRARY pylib AS '/home/dbadmin/add2ints.py' LANGUAGE 'Python';
    CREATE LIBRARY
    eonv1203=> CREATE OR REPLACE FUNCTION add2ints AS LANGUAGE 'Python' NAME 'add2ints_factory' LIBRARY pylib fenced;
    CREATE FUNCTION
    eonv1203=> CREATE TABLE bunch_of_numbers (product_id int, numbs_1 int, numbs_2 int);
    CREATE TABLE
    eonv1203=> COPY bunch_of_numbers FROM STDIN;
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.

    200|10|10
    300|1|4
    400|6|6
    100|30|144
    .

    eonv1203=> SELECT numbs_1, numbs_2, add2ints(numbs_1, numbs_2, product_id) AS add2ints_sum FROM bunch_of_numbers;
     numbs_1 | numbs_2 | add2ints_sum
    ---------+---------+--------------
          10 |      10 |           20
           6 |       6 |           12
          30 |     144 |          174
           1 |       4 |            5
    (4 rows)

    eonv1203=>

    eonv1203=> select * from user_functions where function_name ilike '%2ints%';
    -[ RECORD 1 ]----------+---------------------------------------------------
    schema_name | public
    owner | dbadmin
    function_name | add2ints
    procedure_type | User Defined Function
    function_return_type | Integer
    function_argument_type | Integer, Integer, Integer
    function_definition | Class 'add2ints_factory' in Library 'public.pylib'
    volatility | volatile
    is_strict | f
    is_fenced | t
    comment |
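
For the p-value use case from the question, a library file in the same shape as add2ints.py might look like the sketch below. It is an illustration under assumptions, not the linked example: the names z_two_sided_p, z_pvalue, and z_pvalue_factory are hypothetical, and it assumes the Python SDK offers getFloat, setFloat, and addFloat as counterparts to the integer calls used in add2ints.py. NULL handling is left out. The import guard is only there so the math can be tried outside Vertica, where vertica_sdk is not available.

```python
import math

try:
    import vertica_sdk  # supplied by the Vertica UDx runtime
except ImportError:
    vertica_sdk = None  # allows testing the math outside Vertica


def z_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if vertica_sdk is not None:

    class z_pvalue(vertica_sdk.ScalarFunction):
        """Hypothetical scalar UDx: FLOAT z-statistic in, FLOAT p-value out."""

        def processBlock(self, server_interface, arg_reader, res_writer):
            while True:
                # Assumed float accessors, mirroring getInt/setInt in add2ints.py.
                res_writer.setFloat(z_two_sided_p(arg_reader.getFloat(0)))
                res_writer.next()
                if not arg_reader.next():
                    break

    class z_pvalue_factory(vertica_sdk.ScalarFunctionFactory):
        def createScalarFunction(self, srv):
            return z_pvalue()

        def getPrototype(self, srv_interface, arg_types, return_type):
            arg_types.addFloat()
            return_type.addFloat()

        def getReturnType(self, srv_interface, arg_types, return_type):
            return_type.addFloat()
```

Saved under a path of your choosing, such a file would presumably be registered with the same CREATE LIBRARY and CREATE FUNCTION statements as in the session above, using NAME 'z_pvalue_factory'.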

  • Bryan_H (Vertica Employee, Administrator)

    I posted an example to add the NumPy implementation of FFT as a UDX at https://github.com/bryanherger/vertica-python-fft
    You could adapt it to expose other statistical functions from Python packages as Vertica UDxs.
    If you can run these tests client side, the VerticaPy library implements a number of analytic and ML functions: https://www.vertica.com/python/
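
If the tests can run client side, the p-values asked about in the question can also be computed with nothing but the Python standard library. The sketch below is a hedged illustration, not VerticaPy code and not anything shipped with Vertica: all function names are hypothetical, and the incomplete gamma and beta functions are evaluated with textbook series and continued-fraction expansions. Where SciPy is installed, scipy.stats.norm.sf, scipy.stats.t.sf, and scipy.stats.chi2.sf are likely the better choice.

```python
import math

_EPS = 1e-12      # relative convergence tolerance
_TINY = 1e-300    # guard against division by zero
_MAX_ITER = 500


def _nz(v):
    """Return v, or a tiny positive number if v is too close to zero."""
    return v if abs(v) > _TINY else _TINY


def _upper_gamma_q(a, x):
    """Regularized upper incomplete gamma Q(a, x) for a > 0, x >= 0."""
    if x <= 0.0:
        return 1.0
    log_prefix = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series for the lower function P(a, x); Q = 1 - P.
        term = total = 1.0 / a
        ap = a
        for _ in range(_MAX_ITER):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * _EPS:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    b = x + 1.0 - a
    c = 1.0 / _TINY
    d = 1.0 / b
    h = d
    for i in range(1, _MAX_ITER + 1):
        an = -i * (i - a)
        b += 2.0
        d = 1.0 / _nz(an * d + b)
        c = _nz(b + an / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < _EPS:
            break
    return math.exp(log_prefix) * h


def _beta_cf(a, b, x):
    """Continued fraction used by the regularized incomplete beta function."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _nz(1.0 - qab * x / qap)
    h = d
    for m in range(1, _MAX_ITER + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _nz(1.0 + aa * d)
        c = _nz(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _nz(1.0 + aa * d)
        c = _nz(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < _EPS:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def z_p_two_sided(z):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def t_p_two_sided(t, df):
    """Two-sided p-value for a Student t statistic with df degrees of freedom."""
    return _reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))


def chi2_p_upper(x, df):
    """Upper-tail p-value for a chi-square statistic with df degrees of freedom."""
    return _upper_gamma_q(df / 2.0, x / 2.0)
```

For instance, t_p_two_sided(2.228, 10) and chi2_p_upper(3.841, 1) should both come out close to 0.05, matching the usual critical-value tables.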

  • Options

    @Bryan_H This is perfect. Getting Python going took a little investment but this works.
