Need a match function for Vertica SQL

I have 1 table in Vertica with a list of codes. Each field has 1 code in it (text) e.g. 123456
My 2nd table in Vertica has a list of codes. Each field can have more than 1 code in it (text) separated by a comma e.g. 123456,522122,345644

I want a query to select all from table 2 and add a column at the end called "Match" with a "Y" if the multiple code field contains any code from table 1

The example shown above should have a match of "Y"

Tagged:
«1

Answers

  • Jim_KnicelyJim_Knicely Administrator

    What version of Vertica are you running? This is relatively easy in Vertica 10.1.x

    Example:

    verticademos=> SELECT * FROM t1;
       c
    --------
     123456
     555555
     666666
    (3 rows)
    
    verticademos=> SELECT * FROM t2;
               c
    -----------------------
     123456,522122,345644
     231423,666666,124132
     12345,678910,11121314
    (3 rows)
    
    verticademos=> ALTER TABLE t2 ADD COLUMN "Match" VARCHAR(1);
    ALTER TABLE
    
    verticademos=> UPDATE t2 SET "Match" = 'Y' FROM (SELECT c FROM (SELECT c, explode(STRING_TO_ARRAY(c USING PARAMETERS collection_delimiter = ',')) OVER (PARTITION BY c) FROM t2) foo WHERE EXISTS (SELECT NULL FROM t1 WHERE t1.c = foo.value)) foo WHERE foo.c = t2.c;
     OUTPUT
    --------
          2
    (1 row)
    
    verticademos=> SELECT * FROM t2;
               c           | Match
    -----------------------+-------
     12345,678910,11121314 |
     123456,522122,345644  | Y
     231423,666666,124132  | Y
    (3 rows)
    
    
  • thanks - going to try to plug this in and I'll let you know:)
    not sure what version of Vertica I'm using

  • Jim_KnicelyJim_Knicely Administrator

    You can get the version with the VERSION function!

    verticademos=> SELECT version();
                   version
    -------------------------------------
     Vertica Analytic Database v10.1.1-1
    (1 row)
    
  • I'm getting an error testing this - not good with SQL:)

    SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "verticademos"

    How do I place this in my current query - as follows?

    Select o.*
    from db.t2 o
    where where "Report Date" >= (select max("Report Date") from db.t2)

    I just want to add a field at the end of my columns for the Match column Y/N

  • DaveTDaveT Employee

    Another variation of that if you just want a query:

    select t2.c, decode(t1.c,null,'N','Y') "Match" from t2 left join t1 on contains(string_to_array(t2.c),t1.c);

           c           | Match
    

    -----------------------+-------
    231423,666666,124132 | Y
    123456,522122,345644 | Y
    12345,678910,11121314 | N
    (3 rows)

    As Jim indicated, the SQL will be slightly different if you are pre-10.1 and pre-10.1.1.

  • SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function string_to_array(varchar) does not exist, or permission is denied for string_to_array(varchar)

  • I have the Array function but not string_to_array

  • Rather than on contains(string to array... - is it possible to use a where statement here?

  • Thanks for your help directing me - I found the following works:

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from report o
    left join lifesustain l on contains(array[o.'service code'],l.SERVICECODE)

  • Well, it seems to be matching 1 to 1 but when there are multiples in T2 such as ( 12345,678910,11121314), it isn't matching when the code is in T1 (678910)

  • What does c represent in this statement?

    UPDATE t2 SET "Match" = 'Y' FROM (SELECT c FROM (SELECT c, explode(STRING_TO_ARRAY(c USING PARAMETERS collection_delimiter = ',')) OVER (PARTITION BY c) FROM t2) foo WHERE EXISTS (SELECT NULL FROM t1 WHERE t1.c = foo.value)) foo WHERE foo.c = t2.c;

  • Jim_KnicelyJim_Knicely Administrator

    It is the column name in the sample tables I used (T1 and T2).

  • So my query is as follows - can you show me how to add in the explode (string to array) which will create another field at the end based on the tables I gave you?

    SELECT *
    from T2

  • Jim_KnicelyJim_Knicely Administrator
    edited June 2021

    You haven't told us the Vertica version yet :)

    First check if you have the EXPLODE function by running \df in vsql:

    verticademos=> \df explode
                             List of functions
     procedure_name | procedure_return_type | procedure_argument_types
    ----------------+-----------------------+--------------------------
     Explode        | Any                   | Any
    (1 row)
    
    

    Or you could run this SQL:

    verticademos=> SELECT function_name FROM user_functions WHERE function_name = 'Explode';
     function_name
    ---------------
     Explode
    (1 row)
    
  • slc1axjslc1axj
    edited June 2021

    Version: Vertica Analytic Database v10.1.0-2 - Yes, the Explode function is there

  • DaveTDaveT Employee

    The query I provided will work but you'd need to change the STRING_TO_ARRAY syntax until you get to 10.1.1.

    It will be one of these two (can't recall which version started using different syntax but I gave an example for 10.1.1 which is better and clearer):
    string_to_array('['||t2.c||']',',')
    or
    string_to_array('['||t2.c||']' using parameters collection_delimiter=',')

  • The string_to_array function was not found? SELECT function_name FROM user_functions WHERE function_name = 'string_to_array';

  • Jim_KnicelyJim_Knicely Administrator

    You won't see it in USER_FUNCTIONS.

    Try \df

    verticademos=> \df string_to_array
                             List of functions
     procedure_name  | procedure_return_type | procedure_argument_types
    -----------------+-----------------------+--------------------------
     string_to_array | VarcharArray1D        | Varchar
     string_to_array | VarcharArray1D        | Varchar, Varchar
    (2 rows)
    
  • \df string_to_array SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "\"

  • Jim_KnicelyJim_Knicely Administrator

    You have to run \df in vsql!

  • Jim_KnicelyJim_Knicely Administrator

    Or just try this:

    SELECT string_to_array('[' || 'Vertica' || ']');
    
  • SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function string_to_array(varchar) does not exist, or permission is denied for string_to_array(varchar)

  • slc1axjslc1axj
    edited June 2021

    ok tried this query:

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from wfmgmt_prd.open_report o
    left join WFMGMT_PRD.map_hcpc_lifesustain l on contains(string_to_array('['||o.'service code'||']' using parameters collection_delimiter=','))
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%'

    Getting error: SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function string_to_array(varchar) does not exist, or permission is denied for string_to_array(varchar)

    WFMGMT_PRD.map_hcpc_lifesustain = T1
    wfmgmt_prd.open_report = T2

  • Jim_KnicelyJim_Knicely Administrator

    Try:

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from wfmgmt_prd.open_report o
    left join WFMGMT_PRD.map_hcpc_lifesustain l on contains(string_to_array('['||o.'service code'||']', ','))
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%'

  • SQL Error [3457] [42883]: [Vertica]VJDBC ERROR: Function contains(array[varchar]) does not exist, or permission is denied for contains(array[varchar])

  • Tried this one too but have error: SQL Error [4856] [42601]: [Vertica]VJDBC ERROR: Syntax error at or near "from"

    SELECT o.'service code',decode(l.SERVICECODE,null,'N','Y')"Match"
    from wfmgmt_prd.open_report o
    left join WFMGMT_PRD.map_hcpc_lifesustain l on contains(StringTokenizerDelim(o.'service code',',') over (partition by o.'intake id') from wfmgmt_prd.open_report)
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%';

  • Found out this will break up the service codes in T2 but how do I write this into the query to code the Match as a 'Y'?

    SELECT v_txtindex.StringTokenizerDelim(o.'service code',',') over()
    from WFMGMT_PRD.open_report o
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%';

  • Jim_KnicelyJim_Knicely Administrator

    I have no idea why your STRING_TO_ARRAY function isn't working.

    Anyway, you can use StringTokenizerDelim like the Explode function...

    Example:

    verticademos=> SELECT * FROM t1;
       c
    --------
     678910
     123456
     1
    (3 rows)
    
    verticademos=> SELECT * FROM t2;
               c
    -----------------------
     12345,678910,11121314
     123456,522122,345644
     2,3,4
    (3 rows)
    
    verticademos=> SELECT foo.c, NVL2(t1.c, 'Y', 'N') "Match"
    verticademos->   FROM (SELECT c, v_txtindex.StringTokenizerDelim(c, ',') OVER (PARTITION BY c)
    verticademos(>           FROM t2) foo
    verticademos->   LEFT JOIN t1
    verticademos->     ON foo.words = t1.c
    verticademos->  LIMIT 1 OVER (PARTITION BY foo.c ORDER BY foo.c, NVL2(t1.c, 'Y', 'N') DESC);
               c           | Match
    -----------------------+-------
     12345,678910,11121314 | Y
     123456,522122,345644  | Y
     2,3,4                 | N
    (3 rows)
    

    Send me the DDL for each of your tables and I can rewrite the above for you.

    To get the DDL you need to run these two commands:

    SELECT export_objects('','wfmgmt_prd.open_report');
    SELECT export_objects('','WFMGMT_PRD.map_hcpc_lifesustain');

    Post the results here or send them as a message directly to me.

  • SQL Error [5297] [0A000]: [Vertica]VJDBC ERROR: Unsupported use of LIMIT/OFFSET clause when running export_objects

    I can tell you that Table 1 = map_hcpc_lifesustain (Varchar(20) field called 'SERVICECODE') - this is a list with single service codes
    Table 2 = open_report (Varchar(500)field called 'service code') - this has multiple service codes with comma between

    I need a select query to add a field when I select all from T2(open_report) that says whether the 'service code' matches to one of the 'SERVICECODE' IN T1(map_hcpc_lifesustain

  • This seems to be working but it is grouping the 'service code' from Table 2 - I need to show all the rows from table 2 with the Match at the end.

    SELECT foo.'service code', NVL2(l.SERVICECODE, 'Y', 'N') "Match"
    FROM (SELECT o.'service code', v_txtindex.StringTokenizerDelim(o.'service code', ',') OVER (PARTITION BY o.'service code')
    FROM wfmgmt_prd.open_report o
    where o.'Report Date' = '2021-06-09' and o.'service code' like '%1628%') foo
    LEFT JOIN WFMGMT_PRD.map_hcpc_lifesustain l
    ON foo.words = l.SERVICECODE
    LIMIT 1 OVER (PARTITION BY foo.'service code' ORDER BY foo.'service code', NVL2(l.SERVICECODE, 'Y', 'N') DESC)

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file