Selecting a wildcard of virtual columns from a flex table

robannrobann Vertica Customer

Suppose I have imported Avro/JSON data into a flex table and the data is flattened. There are many 'keys' in the raw column that are available as virtual columns. As a convoluted example lets assume some of the keys are:

person.salary
person.name.firstname
person.name.lastname
person.name.nickname
person.name.title
person.favourite.colour

What I would like to do is to select only and all of the data with keys like person.name.*

i.e. return data like:

person.name.firstname person.name.lastname person.name.nickname person.name.title
bob smith bigguy mr
jane jones ace mrs

I know I can do this by explicitly listing all the virtual columns, but let's assume I don't know all of the keys that could have been added under 'name'.

Is there any way to execute this and return this subset of the data? I'm hoping to wildcard/like the key name.

Thanks,
Robert.

Tagged:

Comments

  • Bryan_HBryan_H Vertica Employee Administrator

    Consider the following flex data:
    d2=> select maptostring(raw) from rest;

    maptostring

    {
    "restaurant.name": "Bob's pizzeria",
    "restaurant.cuisine": "Italian",
    "restaurant.location.city": "Cambridge",
    "restaurant.location.zip": "02140",
    "restaurant.menu": {
    "0.item": "cheese pizza",
    "0.price": "$8.25",
    "1.item": "chicken pizza",
    "1.price": "$11.99",
    "2.item": "spinach pizza",
    "2.price": "$10.50"
    }
    }
    {
    "restaurant.name": "Bill's pizzeria",
    "restaurant.cuisine": "Italian",
    "restaurant.location.city": "Somerville",
    "restaurant.location.zip": "02141",
    "restaurant.menu": {
    "0.item": "cheese pizza",
    "0.price": "$8.25",
    "1.item": "chicken pizza",
    "1.price": "$11.99",
    "2.item": "spinach pizza",
    "2.price": "$10.50"
    }
    }
    {
    "restaurant.name": "Bart's pizzeria",
    "restaurant.cuisine": "Italian",
    "restaurant.location.city": "Newton",
    "restaurant.location.zip": "02142",
    "restaurant.menu": {
    "0.item": "cheese pizza",
    "0.price": "$8.25",
    "1.item": "chicken pizza",
    "1.price": "$11.99",
    "2.item": "spinach pizza",
    "2.price": "$10.50"
    }
    }
    (3 rows)

    I can use MAPITEMS to extract the flattened restaurant locations with a wildcard as follows:
    d2=> select * from (select mapitems(raw) over () from rest) r1 where keys::varchar like 'restaurant.location%';
    keys | values
    --------------------------+------------
    restaurant.location.city | Cambridge
    restaurant.location.zip | 02140
    restaurant.location.city | Somerville
    restaurant.location.zip | 02141
    restaurant.location.city | Newton
    restaurant.location.zip | 02142
    (6 rows)

  • robannrobann Vertica Customer

    The problem is that every key and value is on their own row. In your example, which of the zip codes is the one for Barts? For the data to be useful you need all the data items on an individual row. i.e.

    | Bart's pizzeria | Newton | 02142 |
    | Bill's pizzeria | Somerville | 02141 |

    Is there anyway to combine the mapitems data with the row data to collapse into an individual row? Without having to know ( and hard-code) all the virtual column names?

    Thanks,
    Robert.

  • Bryan_HBryan_H Vertica Employee Administrator

    You can group the results and IMPLODE into a single row, or use array functions to collapse the list to an array.

  • robannrobann Vertica Customer

    Thanks for the help, I think I'll have to change the way the application retrieves the data but I should be able to make this work.

    Thanks,
    Robert

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file