Need a match function for Vertica SQL

slc1axj · June 2021

ok figured that out by removing the last line of code above but still looking how to bring in other columns from T2

slc1axj · June 2021

Service Code #1628 is in T1; however, it's flagging it as a N - so maybe this isn't working...

1628,1640 Y
1628,1640 N
1628 Y
1628 Y
1628 Y
1628 Y
1628 Y
1628 Y
1628 Y
1628,1629,1640 Y
1628,1629,1640 N
1628,1629,1640 N

Jim_Knicely · June 2021

You need the LIMIT to eliminate the rows:

1628,1640 N
1628,1629,1640 N
1628,1629,1640 N

slc1axj · June 2021

ok - my neck is now stinging after 2 days of trying to get this to work but I can't do the Like or if like or if like 280 times. Can you help me write the query that will bring in all records from T2 with the match column at the end that tells me whether or not there is a 'service code' within that field that matches to a SERVICECODE in T1? I need all rows pulled from T2.

Jim_Knicely · June 2021

How discrete are the codes?

For example, will there be a chance where there is this sequence of codes:

123,456,789

And in the table that has the code list there is a code 45?

If not, I was speaking to a colleague who suggesteds something like this:

verticademos=> SELECT * FROM t1;
     c
-----------
 678910
 123456
 111222333
(3 rows)

verticademos=> SELECT * FROM t2;
             c
----------------------------
 12345,678910,11121314
 123456,522122,345644
 127361323,12123119,2873233
(3 rows)

verticademos=> SELECT t2.c, MAX(CASE WHEN INSTR(t2.c, t1.c) >= 1 THEN 'Y' ELSE 'N' END) "Match" FROM t2 CROSS JOIN t1 GROUP BY 1 ORDER BY 1;
             c              | Match
----------------------------+-------
 12345,678910,11121314      | Y
 123456,522122,345644       | Y
 127361323,12123119,2873233 | N
(3 rows)

But if your codes are NOT discrete enough, you could get false positives.

Example:

verticademos=> INSERT INTO t1 SELECT 1;
 OUTPUT
--------
      1
(1 row)

verticademos=> SELECT t2.c, MAX(CASE WHEN INSTR(t2.c, t1.c) >= 1 THEN 'Y' ELSE 'N' END) "Match" FROM t2 CROSS JOIN t1 GROUP BY 1 ORDER BY 1;
             c              | Match
----------------------------+-------
 12345,678910,11121314      | Y
 123456,522122,345644       | Y
 127361323,12123119,2873233 | Y
(3 rows)

That last one is now incorrect!

slc1axj · June 2021

the codes from T1 are mutually exclusive - codes in both tables are 4 characters long. The only difference is T2 contains multiple codes divided by a comma and T1 has just a list of codes with only 1 code per row.

slc1axj · June 2021

By re-writing your query, the results that come up show mutually exclusive service codes on the left with a Y/N on right but I need the query to pull all columns from T2

Jim_Knicely · June 2021

Table T2 is just my example. It has one column. If yours has more, just do this:

SELECT t2.*, MAX(CASE WHEN INSTR(t2.c, t1.c) >= 1 THEN 'Y' ELSE 'N' END) "Match" FROM t2 CROSS JOIN t1 GROUP BY 1 ORDER BY 1;

slc1axj · June 2021

SQL Error [2640] [42803]: [Vertica]VJDBC ERROR: Column "o.plan pick" must appear in the GROUP BY clause or be used in an aggregate function

slc1axj · June 2021

it may want all my column names in the Group by?

Jim_Knicely · June 2021

Sorry!

SELECT t2.*, MAX(CASE WHEN INSTR(t2.c, t1.c) >= 1 THEN 'Y' ELSE 'N' END) "Match" FROM t2 CROSS JOIN t1 GROUP BY col1, col2, col3, col4, col5, etc ORDER BY col1, col2, col3, col4, col5, etc;

List all of the columns!

slc1axj · June 2021

We may have got it - validating:)

slc1axj · June 2021

WE GOT IT - THANKYOU!!

slc1axj · July 2021

So now I'm having to put 2 crossjoin queries into the query. When you run them separately, they are both pretty fast. However, when these are both in the same query, they are taking a tremendous amount of time (over 35 minutes) in the large query. The quick queries are shown below:

--Life Sustaining
select o.'intake id',
o.'service code',
o.'service category',
MAX(CASE WHEN INSTR(o.'service code', l.SERVICECODE) >= 1 THEN 'Y' ELSE 'N' END) "Life Sustaining"
from wfmgmt_prd.open_report_hourly o
CROSS JOIN WFMGMT_PRD.map_hcpc_lifesustain l
where "Report Interval" = '2021-06-25 12:00:00'
and o."queue type" = 'Provider Staffing'
Group by o.'intake id',o.'service code',o.'service category';

--Auto-Provider
select o.'intake id',
o.'provider hcpc/revenue code',
o.'zip code',
MAX(CASE WHEN INSTR(o.'provider hcpc/revenue code', a.HCPC) >= 1 and o.'zip code'= a.zip THEN a.provider ELSE 'N' END) "Auto-Provider"
from wfmgmt_prd.open_report_hourly o
CROSS JOIN WFMGMT_PRD.map_auto_provider a
where "Report Interval" = '2021-06-25 12:00:00'
and o."queue type" = 'Provider Staffing'
Group by o.'intake id',o.'provider hcpc/revenue code',o.'zip code';

--Combined
select o.'intake id',
o.'provider hcpc/revenue code',
o.'zip code',
o.'service code',
o.'service category',
MAX(CASE WHEN INSTR(o.'service code', l.SERVICECODE) >= 1 THEN 'Y' ELSE 'N' END) "Life Sustaining",
MAX(CASE WHEN INSTR(o.'provider hcpc/revenue code', a.HCPC) >= 1 and o.'zip code'= a.zip THEN a.provider ELSE 'N' END) "Auto-Provider"
from wfmgmt_prd.open_report_hourly o
CROSS JOIN WFMGMT_PRD.map_hcpc_lifesustain l
CROSS JOIN WFMGMT_PRD.map_auto_provider a
where "Report Interval" = '2021-06-25 12:00:00'
and o."queue type" = 'Provider Staffing'
Group by o.'intake id',o.'provider hcpc/revenue code',o.'zip code',o.'service code',o.'service category';

Need a match function for Vertica SQL

Answers

Leave a Comment