We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Need a match function for Vertica SQL — Vertica Forum

Need a match function for Vertica SQL

slc1axjslc1axj Vertica Customer

I have 1 table in Vertica with a list of codes. Each field has 1 code in it (text) e.g. 123456
My 2nd table in Vertica has a list of codes. Each field can have more than 1 code in it (text) separated by a comma e.g. 123456,522122,345644

I want a query to select all from table 2 and add a column at the end called "Match" with a "Y" if the multiple code field contains any code from table 1

The example shown above should have a match of "Y"

Tagged:
«1

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    What version of Vertica are you running? This is relatively easy in Vertica 10.1.x

    Example:

    verticademos=> SELECT * FROM t1;
       c
    --------
     123456
     555555
     666666
    (3 rows)
    
    verticademos=> SELECT * FROM t2;
               c
    -----------------------
     123456,522122,345644
     231423,666666,124132
     12345,678910,11121314
    (3 rows)
    
    verticademos=> ALTER TABLE t2 ADD COLUMN "Match" VARCHAR(1);
    ALTER TABLE
    
    verticademos=> UPDATE t2 SET "Match" = 'Y' FROM (SELECT c FROM (SELECT c, explode(STRING_TO_ARRAY(c USING PARAMETERS collection_delimiter = ',')) OVER (PARTITION BY c) FROM t2) foo WHERE EXISTS (SELECT NULL FROM t1 WHERE t1.c = foo.value)) foo WHERE foo.c = t2.c;
     OUTPUT
    --------
          2
    (1 row)
    
    verticademos=> SELECT * FROM t2;
               c           | Match
    -----------------------+-------
     12345,678910,11121314 |
     123456,522122,345644  | Y
     231423,666666,124132  | Y
    (3 rows)
    
    
  • slc1axjslc1axj Vertica Customer

    thanks - going to try to plug this in and I'll let you know:)
    not sure what version of Vertica I'm using

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    You can get the version with the VERSION function!

    verticademos=> SELECT version();
                   version
    -------------------------------------
     Vertica Analytic Database v10.1.1-1
    (1 row)
    
  • slc1axjslc1axj Vertica Customer

    I'm getting an error testing this - not good with SQL:)

    SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "verticademos"

    How do I place this in my current query - as follows?

    Select o.*
    from db.t2 o
    where where "Report Date" >= (select max("Report Date") from db.t2)

    I just want to add a field at the end of my columns for the Match column Y/N

  • DaveTDaveT Vertica Employee Employee

    Another variation of that if you just want a query:

    select t2.c, decode(t1.c,null,'N','Y') "Match" from t2 left join t1 on contains(string_to_array(t2.c),t1.c);

           c           | Match
    

    -----------------------+-------
    231423,666666,124132 | Y
    123456,522122,345644 | Y
    12345,678910,11121314 | N
    (3 rows)

    As Jim indicated, the SQL will be slightly different if you are pre-10.1 and pre-10.1.1.

  • slc1axjslc1axj Vertica Customer

    SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function string_to_array(varchar) does not exist, or permission is denied for string_to_array(varchar)

  • slc1axjslc1axj Vertica Customer

    I have the Array function but not string_to_array

  • slc1axjslc1axj Vertica Customer

    Rather than on contains(string to array... - is it possible to use a where statement here?

  • slc1axjslc1axj Vertica Customer

    Thanks for your help directing me - I found the following works:

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from report o
    left join lifesustain l on contains(array[o.'service code'],l.SERVICECODE)

  • slc1axjslc1axj Vertica Customer

    Well, it seems to be matching 1 to 1 but when there are multiples in T2 such as ( 12345,678910,11121314), it isn't matching when the code is in T1 (678910)

  • slc1axjslc1axj Vertica Customer

    What does c represent in this statement?

    UPDATE t2 SET "Match" = 'Y' FROM (SELECT c FROM (SELECT c, explode(STRING_TO_ARRAY(c USING PARAMETERS collection_delimiter = ',')) OVER (PARTITION BY c) FROM t2) foo WHERE EXISTS (SELECT NULL FROM t1 WHERE t1.c = foo.value)) foo WHERE foo.c = t2.c;

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    It is the column name in the sample tables I used (T1 and T2).

  • slc1axjslc1axj Vertica Customer

    So my query is as follows - can you show me how to add in the explode (string to array) which will create another field at the end based on the tables I gave you?

    SELECT *
    from T2

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited June 2021

    You haven't told us the Vertica version yet :)

    First check if you have the EXPLODE function by running \df in vsql:

    verticademos=> \df explode
                             List of functions
     procedure_name | procedure_return_type | procedure_argument_types
    ----------------+-----------------------+--------------------------
     Explode        | Any                   | Any
    (1 row)
    
    

    Or you could run this SQL:

    verticademos=> SELECT function_name FROM user_functions WHERE function_name = 'Explode';
     function_name
    ---------------
     Explode
    (1 row)
    
  • slc1axjslc1axj Vertica Customer
    edited June 2021

    Version: Vertica Analytic Database v10.1.0-2 - Yes, the Explode function is there

  • DaveTDaveT Vertica Employee Employee

    The query I provided will work but you'd need to change the STRING_TO_ARRAY syntax until you get to 10.1.1.

    It will be one of these two (can't recall which version started using different syntax but I gave an example for 10.1.1 which is better and clearer):
    string_to_array('['||t2.c||']',',')
    or
    string_to_array('['||t2.c||']' using parameters collection_delimiter=',')

  • slc1axjslc1axj Vertica Customer

    The string_to_array function was not found? SELECT function_name FROM user_functions WHERE function_name = 'string_to_array';

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    You won't see it in USER_FUNCTIONS.

    Try \df

    verticademos=> \df string_to_array
                             List of functions
     procedure_name  | procedure_return_type | procedure_argument_types
    -----------------+-----------------------+--------------------------
     string_to_array | VarcharArray1D        | Varchar
     string_to_array | VarcharArray1D        | Varchar, Varchar
    (2 rows)
    
  • slc1axjslc1axj Vertica Customer

    \df string_to_array SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "\"

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    You have to run \df in vsql!

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Or just try this:

    SELECT string_to_array('[' || 'Vertica' || ']');
    
  • slc1axjslc1axj Vertica Customer

    SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function string_to_array(varchar) does not exist, or permission is denied for string_to_array(varchar)

  • slc1axjslc1axj Vertica Customer
    edited June 2021

    ok tried this query:

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from wfmgmt_prd.open_report o
    left join WFMGMT_PRD.map_hcpc_lifesustain l on contains(string_to_array('['||o.'service code'||']' using parameters collection_delimiter=','))
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%'

    Getting error: SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function string_to_array(varchar) does not exist, or permission is denied for string_to_array(varchar)

    WFMGMT_PRD.map_hcpc_lifesustain = T1
    wfmgmt_prd.open_report = T2

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Try:

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from wfmgmt_prd.open_report o
    left join WFMGMT_PRD.map_hcpc_lifesustain l on contains(string_to_array('['||o.'service code'||']', ','))
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%'

  • slc1axjslc1axj Vertica Customer

    SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function contains(array[varchar]) does not exist, or permission is denied for contains(array[varchar])

  • slc1axjslc1axj Vertica Customer

    Tried this one too but have error: SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "from"

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from wfmgmt_prd.open_report o
    left join WFMGMT_PRD.map_hcpc_lifesustain l on contains(StringTokenizerDelim(o.'service code',',') over (partition by o.'intake id') from wfmgmt_prd.open_report)
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%';

  • slc1axjslc1axj Vertica Customer

    Found out this will break up the service codes in T2 but how do I write this into the query to code the Match as a 'Y'?

    SELECT v_txtindex.StringTokenizerDelim(o.'service code',',') over()
    from WFMGMT_PRD.open_report o
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%';

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    I have no idea why your STRING_TO_ARRAY function isn't working.

    Anyway, you can use StringTokenizerDelim like the Explode function...

    Example:

    verticademos=> SELECT * FROM t1;
       c
    --------
     678910
     123456
     1
    (3 rows)
    
    verticademos=> SELECT * FROM t2;
               c
    -----------------------
     12345,678910,11121314
     123456,522122,345644
     2,3,4
    (3 rows)
    
    verticademos=> SELECT foo.c, NVL2(t1.c, 'Y', 'N') "Match"
    verticademos->   FROM (SELECT c, v_txtindex.StringTokenizerDelim(c, ',') OVER (PARTITION BY c)
    verticademos(>           FROM t2) foo
    verticademos->   LEFT JOIN t1
    verticademos->     ON foo.words = t1.c
    verticademos->  LIMIT 1 OVER (PARTITION BY foo.c ORDER BY foo.c, NVL2(t1.c, 'Y', 'N') DESC);
               c           | Match
    -----------------------+-------
     12345,678910,11121314 | Y
     123456,522122,345644  | Y
     2,3,4                 | N
    (3 rows)
    

    Send me the DDL for each of your tables and I can rewrite the above for you.

    To get the DDL you need to run these two commands:

    SELECT export_objects('','wfmgmt_prd.open_report');
    SELECT export_objects('','WFMGMT_PRD.map_hcpc_lifesustain');

    Post the results here or send them as a message directly to me.

  • slc1axjslc1axj Vertica Customer

    SQL Error [5297] [0A000]: [Vertica]VJDBC ERROR: Unsupported use of LIMIT/OFFSET clause when running export_objects

    I can tell you that Table 1 = map_hcpc_lifesustain (Varchar(20) field called 'SERVICECODE') - this is a list with single service codes
    Table 2 = open_report (Varchar(500)field called 'service code') - this has multiple service codes with comma between

    I need a select query to add a field when I select all from T2(open_report) that says whether the 'service code' matches to one of the 'SERVICECODE' IN T1(map_hcpc_lifesustain

  • slc1axjslc1axj Vertica Customer

    This seems to be working but it is grouping the 'service code' from Table 2 - I need to show all the rows from table 2 with the Match at the end.

    SELECT foo.'service code', NVL2(l.SERVICECODE, 'Y', 'N') "Match"
    FROM (SELECT o.'service code', v_txtindex.StringTokenizerDelim(o.'service code', ',') OVER (PARTITION BY o.'service code')
    FROM wfmgmt_prd.open_report o
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%') foo
    LEFT JOIN WFMGMT_PRD.map_hcpc_lifesustain l
    ON foo.words = l.SERVICECODE
    LIMIT 1 OVER (PARTITION BY foo.'service code' ORDER BY foo.'service code', NVL2(l.SERVICECODE, 'Y', 'N') DESC)

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file