How to extract elements in repeated field into separate records (rows)?
Hi all!
I have JSON flex table with nested records as array.
Like this:
{
"ID":"12345",
"Update":{
"Time":"20150227",
"Type":"input",
"Record":[
{
"PartID":"4",
"PartType":"0"
},
{
"PartID":"7",
"PartType":"1"
}
]
}
}
I need to transform it in table like this:
ID | PartID | PartType
-----+--------+---------
12345 4 0
12345 7 1
In Apache Drill it could easily be done through FLATTEN function.
See https://cwiki.apache.org/confluence/display/DRILL/FLATTEN+Function for details.
Is that true that Vertica does not have such a great possibility? Or this could be done some other way?
-1
Comments
I've just started to explore the Vertica JSON functionality, but you will most likely need a script to format it in your desired output. You can specify to flatten_arrays=true in the fjsonparser to flatten the submaps:
The closest I could get is:
HTH
Hi, norbert-krupa.
Thanks for your reply.
Unfortunately that is definitely what we try to avoid. We need to make selects based on Record fields PartID and PartType. Actually to aggregate them and count - how many pairs of "PartID - PartType" there is in the table. And while Record array splits to separate columns this could not be achived..
Typically we have hundreds of 'Record' in 'Update' array. So, in that case Vertica should create hundreds columns and that is not so useful.
I'm going through a similar exercise in which I'm trying to normalize many nested arrays. If I'm successful I'll make sure to share my findings.