JSON - MapJSONExtractor problem with start_point
Guys,
It looks like there is an issue related to MapJSONExtractor with using start_point flag.
Im trying to query simple json array, by using previously mentioned flat.
My table with json contains two rows:
id | json
----+----------------------------------------------------------------------------
2 | {"key":"test_value","nationality":[{"key":"UK","value":"United Kingdom"}]}
1 | {"key":"test_value","nationality":[{"key":"AT","value":"Austria"}]}
Im trying to retireve nationality.value to get country name:
default=> select id,MAPLOOKUP(MapJSONExtractor(json using parameters >start_point='nationality'),'value') from km_test;
id | MAPLOOKUP
----+-----------
1 | Austria
2 |
(2 rows)
It looks like that the function MapJSONExtractor works fine with first row.
Both json entries are pretty equal, so i believe that MapJSONExtractor should gives the same results, but it is not....
default=> \x
Expanded display is on.
default=> select id,MAPLOOKUP(MapJSONExtractor(json using parameters >start_point='nationality'),'value'),MapJSONExtractor(json using parameters start_point='nationality') from >km_test;
-[ RECORD 1 ]----+------------------------------------------------------------------------------------------------------->---------------------------------------------------------------------------------------------------------------------------->-------------------------------------------------------------------------------
id | 1
MAPLOOKUP | Austria
MapJSONExtractor | >\001\000\000\000\025\000\000\000\002\000\000\000\014\000\000\000\016\000\000\000ATAustria\002>\000\000\000\014\000\000\000\017\000\000\000keyvalue
-[ RECORD 2 ]----+------------------------------------------------------------------------------------------------------->---------------------------------------------------------------------------------------------------------------------------->-------------------------------------------------------------------------------
id | 2
MAPLOOKUP |
MapJSONExtractor | >\001\000\000\000R\000\000\000\002\000\000\000\014\000\000\000\026\000\000\000test_value\001\0>00\000\000\034\000\000\000\002\000\000\000\014\000\000\000\016\000\000\000UKUnited >Kingdom\002\000\000\000\014\000\000\000\021\000\000\0000.key0.value\002\000\000\000\014\000\0>00\000\017\000\000\000keynationality
I used sorting option, to see whats happend and as i already mentioned, works only on first row
default=> select id,MAPLOOKUP(MapJSONExtractor(json using parameters >start_point='nationality'),'value') from km_test order by 1 asc;
id | MAPLOOKUP
----+-----------
1 | Austria
2 |
(2 rows)default=> select id,MAPLOOKUP(MapJSONExtractor(json using parameters >start_point='nationality'),'value') from km_test order by 1 desc;
id | MAPLOOKUP
----+----------------
2 | United Kingdom
1 |
(2 rows)
I performed this test on two different vertica versions:
dbadmin=> select version();
version
Vertica Analytic Database v9.3.1-11
(1 row)
dbadmin=>
default=> select version();version
Vertica Analytic Database v9.2.0-2
(1 row)
I apprieciate your help!
Answers
Hi, I am able to reproduce this even with Vertica 10.1.0-0. We've logged a case with engineering to investigate. If you are able, I recommend also filing a case with Vertica support to ensure you are notified of any fix or workaround.
A possible workaround is to load the JSON using FJSONPARSER, which works as expected here for flex table "jsonstartf":
dbadmin=> copy jsonstartf from local stdin parser fjsonparser(start_point='nationality',start_point_occurrence=1);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"key":"test_value","nationality":[{"key":"AT","value":"Austria"}]}
>> {"key":"test_value","nationality":[{"key":"UK","value":"United Kingdom"}]}
>> {"key":"test_value","nationality":[{"key":"CA","value":"Canada"}]}
>> .
Rows Loaded
(1 row)
Time: First fetch (1 row): 21020.874 ms. All rows formatted: 21020.966 ms
dbadmin=> select maptostring(raw) from jsonstartf;
maptostring
{
"key": "AT",
"value": "Austria"
}
{
"key": "UK",
"value": "United Kingdom"
}
{
"key": "CA",
"value": "Canada"
}
(3 rows)
Time: First fetch (3 rows): 248.649 ms. All rows formatted: 248.845 ms
Thank you Bryan for your answer.
During the day I found a quick workaround which looks pretty the same as yours.