Inverse statistical distributions (z, t, chi-square)?
Does Vertica have a way to calculate p-values from common test statistics, such as Z, T, or Chi-Square tests?
For example, something like INVERSE_T(t, df)?
Alternatively, what's the best hello-world example to create a custom scalar function in Python or Java?
Answers
Currently we don't have any functions to calculate p-values or perform t-tests.
Regarding the Python UDx, here is a sample. First download add2ints.py from GitHub:
https://github.com/vertica/UDx-Examples/blob/master/Python/add2ints/add2ints.py
eonv1203=> CREATE OR REPLACE LIBRARY pylib AS '/home/dbadmin/add2ints.py' LANGUAGE 'Python';
CREATE LIBRARY
eonv1203=> CREATE OR REPLACE FUNCTION add2ints AS LANGUAGE 'Python' NAME 'add2ints_factory' LIBRARY pylib fenced;
CREATE FUNCTION
eonv1203=> CREATE TABLE bunch_of_numbers (product_id int, numbs_1 int, numbs_2 int);
CREATE TABLE
eonv1203=> COPY bunch_of_numbers FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
eonv1203=> SELECT numbs_1, numbs_2, add2ints(numbs_1, numbs_2, product_id) AS add2ints_sum FROM bunch_of_numbers;
 numbs_1 | numbs_2 | add2ints_sum
---------+---------+--------------
      10 |      10 |           20
       6 |       6 |           12
      30 |     144 |          174
       1 |       4 |            5
(4 rows)
eonv1203=>
eonv1203=> select * from user_functions where function_name ilike '%2ints%';
-[ RECORD 1 ]----------+---------------------------------------------------
schema_name | public
owner | dbadmin
function_name | add2ints
procedure_type | User Defined Function
function_return_type | Integer
function_argument_type | Integer, Integer, Integer
function_definition | Class 'add2ints_factory' in Library 'public.pylib'
volatility | volatile
is_strict | f
is_fenced | t
comment |
I posted an example to add the NumPy implementation of FFT as a UDX at https://github.com/bryanherger/vertica-python-fft
You could adapt this to include other statistical functions from Python packages as Vertica UDX.
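As a rough starting point for that kind of adaptation, here is a minimal sketch of the p-value math in plain Python, using only the standard library. The function names (z_p_value, chi2_p_value) are made up for illustration and are not part of Vertica or any package. The t distribution is left out because it needs the regularized incomplete beta function. If SciPy is installed on the nodes, its scipy.stats distributions (for example scipy.stats.t.sf) would likely be a better choice than hand-rolled series.

```python
import math


def z_p_value(z, two_sided=True):
    """P-value for a standard normal test statistic.

    Uses the identity P(Z > |z|) = 0.5 * erfc(|z| / sqrt(2)).
    With two_sided=False this returns the tail beyond |z| on one side.
    """
    tail = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    return 2.0 * tail if two_sided else tail


def _reg_lower_gamma(a, x, tol=1e-12, max_iter=10000):
    """Regularized lower incomplete gamma P(a, x) via its power series.

    Hypothetical helper: fine for moderate x, but slow and less
    accurate for very large x.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a          # n = 0 term of sum x^n / (a (a+1) ... (a+n))
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_iter:
        n += 1
        term *= x / (a + n)
        total += term
    # Multiply by x^a * e^-x / Gamma(a), computed in log space.
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_p_value(x, df):
    """Upper-tail p-value for a chi-square statistic x with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    p = 1.0 - _reg_lower_gamma(df / 2.0, x / 2.0)
    return max(0.0, p)      # guard against tiny negative rounding error


if __name__ == "__main__":
    print(z_p_value(1.96))
    print(chi2_p_value(5.99, 2))
```

Inside a UDx, functions like these would be called per row from processBlock, in the same place where add2ints.py adds its two arguments, with the prototype and return type changed to float.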
If you can run these tests client side, the VerticaPy library implements a number of analytic and ML functions: https://www.vertica.com/python/
@Bryan_H This is perfect. Getting Python going took a little investment but this works.