Vertica SQL Data Transformation
Dear Friends,
My source table has a table with 2 columns: id, items
ID ITEMS
------------------------------
1 X1, X2, X3, ... , Xn
2 Y1, Y2
And I would like to get the following output.
ID COMBINATION
------------------
1 X1
1 X2
1 X3
...
...
...
1 Xn
2 Y1
2 Y2
Please help. Thank you.
Regards,
Kushal
My source table has a table with 2 columns: id, items
ID ITEMS
------------------------------
1 X1, X2, X3, ... , Xn
2 Y1, Y2
And I would like to get the following output.
ID COMBINATION
------------------
1 X1
1 X2
1 X3
...
...
...
1 Xn
2 Y1
2 Y2
Please help. Thank you.
Regards,
Kushal
0
Comments
/opt/vertica/sdk/examples/TransformFunctions/StringTokenizer.cpp
I have not worked on UDF before. Can't this be done using SQLs
Regards,
Kushal
I have not worked on UDF before. Can't this be done using SQLs
Regards,
Kushal
https://vertica.hpwsportal.com/#/Product/Search/{%22qtext%22%3A%22string%22}/Show
You can install it as a Vertica add-on. Once installed, it is accessible from SQL. (The Marketplace package is based on the "strings_package" code on our GitHub site, in case anyone's curious where it comes from.)
This function is built into Vertica 7.1+ as "TxtIndex.StringTokenizer()".
If you find yourself working with data like this a lot, I would encourage you to take a look at Vertica's Flex extensions:
http://my.vertica.com/docs/7.1.x/PDF/HP_Vertica_7.1.x_Flex_Tables.pdf
In particular, you can parse the comma-separated list into an array using either the "FDelimitedParser()" function (at data-load time, see page 79 in the above doc) or the "MapDelimitedExtractor()" function (after the data has been loaded, see page 67); then use the "MapValues()" function (page 109) to explode the array out to one value per row.
I think "MapDelimitedExtractor()" might be new in Vertica 7.1? The others all ship with 7.0.
(Note that these functions are often used with Vertica Flex Tables, but they don't have to be; you can use them anywhere.)
What you have here is data that's not initially in traditional row/column form. Vertica's Flex extensions are (among other things) intended to help you easily get it into a form from which you can manipulate it using more-traditional SQL.
Adam
I don't remember where (not here), but somewhere I posted UNPIVOT function (UDFTx C++).
Regards,
Kushal