pivot or get dummies ?
Steeeev
Vertica Customer ✭
I am trying to get from
ID Product
A hat
A gloves
B shoes
To
ID hat gloves shoes
A 1 1 0
B 0 0 1
I have
select ID,
MIN(DECODE(TF_NAME, 'hat', '1', '0')) AS 'hat',
MIN(DECODE(TF_NAME, 'gloves', '1', '0')) AS 'gloves',
MIN(DECODE(TF_NAME, 'shoes', '1', '0')) AS 'shoes'
But all the results are 0
Can someone tell me what I did wrong?
0
Answers
What is TF_NAME? And it's also not clear to me what you would need a MIN() around a DECODE()? Try removing the MIN().
Can you share a snippet of the actual table data, and an example output of what you want it to look like?
Hi Steeev,
You can do a manual get_dummies the way you're doing: using decode statements. But in this case the "MIN" are useless as you are not aggregating.
Pivot and One Hot Encoder are things that are different.
In VerticaPy, you can directly get the dummies of your table using the vDataFrame:
https://www.vertica.com/python/documentation_last/vdataframe/vcolumn-methods/get_dummies/index.php
It will generate all the decode statement behind the scene. Otherwise you can achieve the same thing using ONE_HOT_ENCODER function:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AnalyzingData/MachineLearning/DataPreparation/EncodingCategoricalColumns.htm
Vertica has no Pivot function but you can generate the SQL statement using VerticaPy:
https://www.vertica.com/python/documentation_last/vdataframe/main-methods/pivot/
Hope it helps!
@Vertica_Curtis Sorry can't share actual data but you pointed me in the right direction. I have repeating IDs so I need to aggregate but with MAX so that where there is a 1 it won't get wiped out by a zero.
MAX(DECODE(Product, 'shoes', '1', '0')) AS 'shoes'
@badrouali Thanks for the tips. Just need to get the company to enable those features!