HP vertica SQL query to create new column
This is the table I have and I have 5 distinct programs. when a user like a program it reads as follows:
User Program
----------------
A 1
A 4
B 2
B 4
B 5
However I want to write a query that will allow me to also see the 5 distinct programs for each of my user and create a new column that will take two value (binary) 1 if the user liked a specific program and if not 0. Any help will be appreciated. Thanks
User Program NewColumn
A 1 1
A 2 0
A 3 0
A 4 1
A 5 0
B 1 0
B 2 1
B 3 0
B 4 1
B 5 1
0
Comments
Tricky. I needed a little challenge today.
Part of the problem is that UserProgram doesn't contain information about program #3, even though you're output has it. So, at the very least, I had to create a Programs table to store all the distinct programs. If that's not a problem in your data set, then you could replace my lookup to Programs with a "select distinct program from UserProgram".
Then it's a matter of cartesianing the results together. But my original attempt here didn't work, since whatever base table I had had a tendency to drive the result. So, I puleld away, and use a dual cartesian from dual. That solved it.
Solution:
dbadmin=> select * From userprogram ;
user | program
------+---------
A | 4
A | 1
B | 4
B | 5
B | 2
(5 rows)
dbadmin=> select * from programs ;
program | description
---------+-------------
1 | program 1
3 | program 3
5 | program 5
2 | program 2
4 | program 4
(5 rows)
WITH cartesian AS
(select users."user", programs.program
from dual
join (select distinct "user" from userprogram) as users on 1=1
join (select program From programs) as programs on 1=1 )
select c."user", c.program
, case when c.program=up.program then 1 else 0 end
from cartesian c
left join userprogram up on c."user" = up."user" and c.program=up.program
order by 1,2 ;
user | program | case
------+---------+------
A | 1 | 1
A | 2 | 0
A | 3 | 0
A | 4 | 1
A | 5 | 0
B | 1 | 0
B | 2 | 1
B | 3 | 0
B | 4 | 1
B | 5 | 1
(10 rows)
I do have some concerns about how well such a thing might scale. But so long as the projection on UserProgram is ordered by User,Program it should be fairly efficient. With Clause Materialization might also help the performance here, if need be.
Thank you for the reply !!! This works out beautifully