Vertica SQL Data Transformation

Dear Friends,

My source table has a table with 2 columns: id, items

ID        ITEMS
------------------------------
1         X1, X2, X3, ... , Xn

2         Y1, Y2

And I would like to get the following output.

ID     COMBINATION
------------------
1      X1
1      X2
1      X3
...
...
...
1      Xn
2      Y1
2      Y2


Please help. Thank you.

Regards,
Kushal

Comments

  • You can use Vertica's tokenize UDF for implementing this.

    /opt/vertica/sdk/examples/TransformFunctions/StringTokenizer.cpp
  • Sid,
    I have not worked on UDF before. Can't this be done using SQLs

    Regards,
    Kushal
  • Sid,
    I have not worked on UDF before. Can't this be done using SQLs

    Regards,
    Kushal
  • The most simple method is to use UNION , each query will substr the next string from the line 
  • There is also a String Tokenizer in the Vertica Marketplace:

    https://vertica.hpwsportal.com/#/Product/Search/{%22qtext%22%3A%22string%22}/Show

    You can install it as a Vertica add-on.  Once installed, it is accessible from SQL.  (The Marketplace package is based on the "strings_package" code on our GitHub site, in case anyone's curious where it comes from.)

    This function is built into Vertica 7.1+ as "TxtIndex.StringTokenizer()".
  • Hi Kushal,

    If you find yourself working with data like this a lot, I would encourage you to take a look at Vertica's Flex extensions:

    http://my.vertica.com/docs/7.1.x/PDF/HP_Vertica_7.1.x_Flex_Tables.pdf

    In particular, you can parse the comma-separated list into an array using either the "FDelimitedParser()" function (at data-load time, see page 79 in the above doc) or the "MapDelimitedExtractor()" function (after the data has been loaded, see page 67); then use the "MapValues()" function (page 109) to explode the array out to one value per row.

    I think "MapDelimitedExtractor()" might be new in Vertica 7.1?  The others all ship with 7.0.

    (Note that these functions are often used with Vertica Flex Tables, but they don't have to be; you can use them anywhere.)

    What you have here is data that's not initially in traditional row/column form.  Vertica's Flex extensions are (among other things) intended to help you easily get it into a form from which you can manipulate it using more-traditional SQL.

    Adam
  • Hi!

    I don't remember where (not here), but somewhere I posted UNPIVOT function (UDFTx C++).
  • Thanks everyone for your quick response. and Thanks Adam..i will check and let everyone know how this is implemented

    Regards,
    Kushal

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file