How to make JSON-like result merging ROW() structs into ARRAYs using GROUP BY ?

Suppose I have:
SELECT * FROM t ORDER BY id, tag;
id | tag | value
----+-----+-------
1 | a | 0.1
1 | b | 0.2
2 | a | 0.15
2 | b | 0.25
2 | c | 0.35
2 | d | 0.45
(6 rows)

I want to generate something JSON-like/compatible:

id result
---+-------------------------------------
1 | [{"a":0.1},{"b":0.2}]
2 | [{"a":0.15},{"b":0.25},{"c":0.35},{"d":0.45}]

I can achieve some things that are close:

WITH r AS (
SELECT id, tag || ':' || VALUE::VARCHAR kvp FROM t
)
SELECT id, IMPLODE(kvp) WITHIN GROUP (ORDER BY id, kvp)
FROM r GROUP BY 1 ORDER BY 1;
id | IMPLODE
----+---------------------------------------
1 | ["a:0.1","b:0.2"]
2 | ["a:0.15","b:0.25","c:0.35","d:0.45"]
(2 rows)

SELECT id, STRING_TO_ARRAY(
LISTAGG(tag || '|' || CAST(value AS VARCHAR) USING PARAMETERS separator='|') WITHIN GROUP (ORDER BY tag)
USING PARAMETERS collection_delimiter = '|')
FROM t
GROUP BY 1 ORDER BY 1;
id | STRING_TO_ARRAY
----+-----------------------------------------------
1 | ["a","0.1","b","0.2"]
2 | ["a","0.15","b","0.25","c","0.35","d","0.45"]
(2 rows)

But I can't find a way to create structs, then merge them into an ARRAY.

SELECT id, (tag, value)::ROW(VARCHAR,VARCHAR) FROM t ORDER BY id, tag;
id | row
----+----------------------------
1 | {"tag":"a","value":"0.1"}
1 | {"tag":"b","value":"0.2"}
2 | {"tag":"a","value":"0.15"}
2 | {"tag":"b","value":"0.25"}
2 | {"tag":"c","value":"0.35"}
2 | {"tag":"d","value":"0.45"}
(6 rows)

SELECT id, LISTAGG((tag, value)::ROW(VARCHAR,VARCHAR)) FROM t GROUP BY id ORDER BY id, tag;
ERROR 3457: Function LISTAGG(record) does not exist, or permission is denied for LISTAGG(record)
HINT: No function matches the given name and argument types. You may need to add explicit type casts

SELECT id, LISTAGG(CAST((tag, value)::ROW(VARCHAR,VARCHAR) AS VARCHAR))
FROM t GROUP BY id ORDER BY id;
ERROR 2366: Cannot cast type ROW(varchar(80),varchar(80)) to varchar

SELECT id, IMPLODE((tag, value)::ROW(VARCHAR,VARCHAR))
FROM t GROUP BY id ORDER BY id;
ERROR 9824: Implode is not supported for arguments of struct types

Perhaps this can't be done, since if you use AS to name fields in an array of ROW elements,
and the fields differ across the elements, Vertica forces the right-most name on you.

SELECT ARRAY[ROW('.1' AS a), ROW('.2' AS b)]
UNION ALL
SELECT ARRAY[ROW('.15' AS a), ROW('.25' AS b), ROW('.35' AS c), ROW('.45' AS d)];

array

[{"d":".1"},{"d":".2"}]
[{"d":".15"},{"d":".25"},{"d":".35"},{"d":".45"}]
(2 rows)

Seems like it should be possible.

Answers

  • SergeBSergeB - Select Field - Employee

    Maybe something like this?

    select id, implode(TO_JSON(ROW(tag,value)) as pairs
    from t
    group by id
    order by id;

  • That does work if you want "true" (encoded) JSON back, which may actually be the right thing for clients consuming the result.

  • moshegmosheg Vertica Employee Administrator

    In addition consider this:

    WITH formatted_values AS (
      SELECT
        id,
        CAST('{' || '"' || tag || '":' || value || '}' AS VARCHAR) AS json_obj,
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY tag) as rn
      FROM t
    )
    SELECT
      id,
      '[' ||
      MAX(CASE WHEN rn = 1 THEN json_obj END) ||
      COALESCE(',' || MAX(CASE WHEN rn = 2 THEN json_obj END), '') ||
      COALESCE(',' || MAX(CASE WHEN rn = 3 THEN json_obj END), '') ||
      COALESCE(',' || MAX(CASE WHEN rn = 4 THEN json_obj END), '') ||
      ']' AS result
    FROM formatted_values
    GROUP BY id
    ORDER BY id;
     id |                    result
    ----+-----------------------------------------------
      1 | [{"a":0.1},{"b":0.2}]
      2 | [{"a":0.15},{"b":0.25},{"c":0.35},{"d":0.45}]
    (2 rows)
    
    SELECT id,
           IMPLODE(TO_JSON(ROW(tag,value)))
    FROM t
    GROUP BY id
    ORDER BY id;
     id |                                                                IMPLODE
    ----+---------------------------------------------------------------------------------------------------------------------------------------
      1 | ["{\"tag\":\"a\",\"value\":0.1}","{\"tag\":\"b\",\"value\":0.2}"]
      2 | ["{\"tag\":\"a\",\"value\":0.15}","{\"tag\":\"b\",\"value\":0.25}","{\"tag\":\"c\",\"value\":0.35}","{\"tag\":\"d\",\"value\":0.45}"]
    (2 rows)
    
This discussion has been closed.