How do I PIVOT in Vertica
Hi folks,
I have spend several days on this, and my boss is getting upset.
I am trying to PIVOT the following table in Vertica. I found several options online, but they all require that the Pivoted column has to be an INTEGER. I tried the DECODE function, but I need to Pivot on a string and not an INT.
With the example below, the final result set should be... ID (not pivoted),DESCRIPTION (Columns across), EMPLOYED (will be the values).
Any help would be greatly appreciated.
/BEGIN/
DROP TABLE IF EXISTS Pivot_TEST
CREATE TABLE Pivot_TEST
(
ID INT,
DESCRIPTION VARCHAR(25),
EMPLOYED VARCHAR(5)
);
INSERT INTO Pivot_TEST VALUES (1, 'Fulltime','YES');
INSERT INTO Pivot_TEST VALUES (2, 'Fulltime','YES');
INSERT INTO Pivot_TEST VALUES (3, 'Contract','YES');
INSERT INTO Pivot_TEST VALUES (4, 'Contract','YES');
INSERT INTO Pivot_TEST VALUES (5, 'Seasonal','YES');
INSERT INTO Pivot_TEST VALUES (6, 'Seasonal','YES');
INSERT INTO Pivot_TEST VALUES (7, 'Terminated','NO');
INSERT INTO Pivot_TEST VALUES (8, 'Suspended','NO');
INSERT INTO Pivot_TEST VALUES (9, 'Fulltime','YES');
INSERT INTO Pivot_TEST VALUES (10,'Fulltime','YES');
SELECT * From Pivot_TEST;
/* END */
Comments
If you just want to create column values using the actual values in the DESCRIPTION column you can write the SQL with DECODE or CASE and have one column for each distinct value of DESCRIPTION. If ID is unique you don't need any aggregate functions. If ID is not unique you don't have to use SUM; you can use MIN or MAX or anything that works with a string. It doesn't really matter what aggregate is used, the purpose in this case is just to bring the like ID values into a single row.
Of course, if you have conflicting information in different rows (such as both 1|Fulltime|YES and 1|Fulltime|NO) then then whether MIN or MAX does matter but which one is really correct?
Let me know if this is what you're looking for:
SELECT ID,
CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Fulltime', 'Yes', 'No')) ELSE 'No' END AS "Fulltime",
CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Contract', 'Yes', 'No')) ELSE 'No' END AS "Contract",
CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Seasonal', 'Yes', 'No')) ELSE 'No' END AS "Seasonal",
CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Terminated', 'Yes', 'No')) ELSE 'No' END AS "Terminated",
CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Suspended', 'Yes', 'No')) ELSE 'No' END AS "Suspended"
FROM Pivot_TEST GROUP BY ID, EMPLOYED order by ID;
Thank you very much for this, but I think I am over complicating this. Here is a smaller table that I just want to PIVOT.
create table test2
(
master_country varchar(25)
);
INSERT INTO test2 values ('French Polynesia');
INSERT INTO test2 values ('Colombia');
INSERT INTO test2 values ('Brazil');
INSERT INTO test2 values ('Mauritius');
INSERT INTO test2 values ('Faroe Islands');
select * from test2;
Similar to the suggestion by Shayon_Sanyal...
dbadmin=> select * from test2;
master_country
French Polynesia
Colombia
Brazil
Mauritius
Faroe Islands
(5 rows)
dbadmin=> select
dbadmin-> max(decode(rn, 1, master_country, null)) || ', ' ||
dbadmin-> max(decode(rn, 2, master_country, null)) || ', ' ||
dbadmin-> max(decode(rn, 3, master_country, null)) || ', ' ||
dbadmin-> max(decode(rn, 4, master_country, null)) || ', ' ||
dbadmin-> max(decode(rn, 5, master_country, null)) master_country
dbadmin-> from (select master_country, row_number() over () rn from test2) foo;
master_country
French Polynesia, Colombia, Brazil, Mauritius, Faroe Islands
(1 row)
Nice! What do you do, when the number of rows to pivot is not known? Is there something generic?
We will soon publish a few use cases for Stored Procedures, which will soon be part of a Vertica 11 subversion.
I actually have a pivot function that would do that for you in that collection.
But it's early days just now, and won't work in the current version.
The principle is this:
With this table
sensors
:You need this query:
You find that out by previously firing a:
.. and then generating as many
MAX(CASE msr WHEN '???' THEN val END) AS val_???
, as you get distinct values - and incorporate that into the final query.I have developed the sample UDx to transform the long-form data to the wide-form data. Please try to use it if you can.
https://github.com/h-serizawa/vertica-extension-functions/tree/master/long2wide
Now that Vertica 11.0.1 is out, and with that, support for forward cursors in Stored procedures - with the same table
sensors
as above, try this:Create the test table:
Then, with a stored procedure, you can pivot the table. Into a target (temporary) table that you can then select from:
The stored procedure is in prototype mode, and actually outputs the query it has generated, and it is going to fire.
And just for playing purposes: Here's the full source code for the stored procedure:
Happy playing ....