Selecting a wildcard of virtual columns from a flex table

robann · April 2023

Suppose I have imported Avro/JSON data into a flex table and the data is flattened. There are many 'keys' in the raw column that are available as virtual columns. As a convoluted example lets assume some of the keys are:

person.salary
person.name.firstname
person.name.lastname
person.name.nickname
person.name.title
person.favourite.colour

What I would like to do is to select only and all of the data with keys like person.name.*

i.e. return data like:

person.name.firstname	person.name.lastname	person.name.nickname	person.name.title
bob	smith	bigguy	mr
jane	jones	ace	mrs

I know I can do this by explicitly listing all the virtual columns, but let's assume I don't know all of the keys that could have been added under 'name'.

Is there any way to execute this and return this subset of the data? I'm hoping to wildcard/like the key name.

Thanks,
Robert.

Bryan_H · April 2023

Consider the following flex data:
d2=> select maptostring(raw) from rest;

maptostring

{
"restaurant.name": "Bob's pizzeria",
"restaurant.cuisine": "Italian",
"restaurant.location.city": "Cambridge",
"restaurant.location.zip": "02140",
"restaurant.menu": {
"0.item": "cheese pizza",
"0.price": "$8.25",
"1.item": "chicken pizza",
"1.price": "$11.99",
"2.item": "spinach pizza",
"2.price": "$10.50"
}
}
{
"restaurant.name": "Bill's pizzeria",
"restaurant.cuisine": "Italian",
"restaurant.location.city": "Somerville",
"restaurant.location.zip": "02141",
"restaurant.menu": {
"0.item": "cheese pizza",
"0.price": "$8.25",
"1.item": "chicken pizza",
"1.price": "$11.99",
"2.item": "spinach pizza",
"2.price": "$10.50"
}
}
{
"restaurant.name": "Bart's pizzeria",
"restaurant.cuisine": "Italian",
"restaurant.location.city": "Newton",
"restaurant.location.zip": "02142",
"restaurant.menu": {
"0.item": "cheese pizza",
"0.price": "$8.25",
"1.item": "chicken pizza",
"1.price": "$11.99",
"2.item": "spinach pizza",
"2.price": "$10.50"
}
}
(3 rows)

I can use MAPITEMS to extract the flattened restaurant locations with a wildcard as follows:
d2=> select * from (select mapitems(raw) over () from rest) r1 where keys::varchar like 'restaurant.location%';
keys | values
--------------------------+------------
restaurant.location.city | Cambridge
restaurant.location.zip | 02140
restaurant.location.city | Somerville
restaurant.location.zip | 02141
restaurant.location.city | Newton
restaurant.location.zip | 02142
(6 rows)

robann · April 2023

The problem is that every key and value is on their own row. In your example, which of the zip codes is the one for Barts? For the data to be useful you need all the data items on an individual row. i.e.

Is there anyway to combine the mapitems data with the row data to collapse into an individual row? Without having to know ( and hard-code) all the virtual column names?

Thanks,
Robert.

Bryan_H · April 2023

You can group the results and IMPLODE into a single row, or use array functions to collapse the list to an array.

robann · April 2023

Thanks for the help, I think I'll have to change the way the application retrieves the data but I should be able to make this work.

Thanks,
Robert

We're Moving!

Create My New Community Account Now

Selecting a wildcard of virtual columns from a flex table

Comments

maptostring

Leave a Comment