Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

MAPTOSTRING turns JSON arrays into objects keyed by array index and erases types

Hey I noticed that the MAPTOSTRING built-in produces JSON that is not quite what one loaded in, even when using FJSONParser(flatten_maps=FALSE,flatten_arrays=FALSE). I leafed through some of the available source https://github.com/vertica/FlexTable/blob/master/src/JSONParser.cpp I'm guessing that's by design as all flex inputs are turned into the VMap internal representation which doesn't have a notion of "arrays" or "value types"

Here's a minimal example:

CREATE FLEX TABLE some_json();

my json:

{
  "stuff": [
    "a",
    "b",
    "c",
    "d"
  ],
  "id": 0
}
{
  "stuff": [
    "e",
    "f",
    "g",
    "h"
  ],
  "id": 1
}

loading it in:

$ echo '{"stuff": ["a", "b", "c", "d"], "id": 0}\n{"stuff": ["e", "f", "g", "h"], "id": 1}' | vsql ... -c "COPY some_json(__raw__) FROM STDIN PARSER public.FJSONParser(flatten_maps=FALSE,flatten_arrays=FALSE)"

what it looks like

$ vsql ... -c "SELECT public.MapToString(__raw__) FROM some_json"
                                  MapToString
-------------------------------------------------------------------------------
 {
        "id": "0",
        "stuff": {
                "0": "a",
                "1": "b",
                "2": "c",
                "3": "d"
        }
}
 {
        "id": "1",
        "stuff": {
                "0": "e",
                "1": "f",
                "2": "g",
                "3": "h"
        }
}
(2 rows)

Is there a way of keeping around the originally loaded data alongside the VMAP column?

Thanks!

Tagged:

Answers

  • You can keep a copy of input JSON objects in a separate column.

    create flex table f1();
    alter table f1 add column json long varchar;
    
    copy f1(json, __raw__ as MapJSONExtractor(json)) from stdin;
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    {"a":["b","c"]}
    \.
    
    select * from f1;
     __identity__ |                                                                                                              __raw__                                                                                                               |      json       
    --------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------
                1 | \001\000\000\000,\000\000\000\001\000\000\000\010\000\000\000\001\000\000\000\016\000\000\000\002\000\000\000\014\000\000\000\015\000\000\000bc\002\000\000\000\014\000\000\000\015\000\000\00001\001\000\000\000\010\000\000\000a | {"a":["b","c"]}
    (1 row)
    
    verticadb20140=> select json, maptostring(__raw__) from f1;
          json       |              maptostring              
    -----------------+---------------------------------------
     {"a":["b","c"]} | {
            "a": {
                    "0": "b",
                    "1": "c"
            }
    }
    (1 row)
    
  • Well. Can we have a new feature - MapToString should try to deduct if this is an array.
    If all keys in dict are sequential numbers starting from 0, it is a sign of array. MapToString should produce an array in this case.
    That would nicely solve a problem.
    This new feature of MapToString should be triggered by new parameter - do_my_best_to_figure_out_array. If set to true, new behaviour will be triggered.

    Please consider filing new feature request.

  • @Ariel_Cary can you create internal feature request, to make MapToString deduct array, when additional parameter specified?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.