How do I PIVOT in Vertica

Hi folks,

I have spend several days on this, and my boss is getting upset. :(

I am trying to PIVOT the following table in Vertica. I found several options online, but they all require that the Pivoted column has to be an INTEGER. I tried the DECODE function, but I need to Pivot on a string and not an INT.

With the example below, the final result set should be... ID (not pivoted),DESCRIPTION (Columns across), EMPLOYED (will be the values).

Any help would be greatly appreciated.

/BEGIN/

DROP TABLE IF EXISTS Pivot_TEST

CREATE TABLE Pivot_TEST
(
ID INT,
DESCRIPTION VARCHAR(25),
EMPLOYED VARCHAR(5)
);

INSERT INTO Pivot_TEST VALUES (1, 'Fulltime','YES');
INSERT INTO Pivot_TEST VALUES (2, 'Fulltime','YES');
INSERT INTO Pivot_TEST VALUES (3, 'Contract','YES');
INSERT INTO Pivot_TEST VALUES (4, 'Contract','YES');
INSERT INTO Pivot_TEST VALUES (5, 'Seasonal','YES');
INSERT INTO Pivot_TEST VALUES (6, 'Seasonal','YES');
INSERT INTO Pivot_TEST VALUES (7, 'Terminated','NO');
INSERT INTO Pivot_TEST VALUES (8, 'Suspended','NO');
INSERT INTO Pivot_TEST VALUES (9, 'Fulltime','YES');
INSERT INTO Pivot_TEST VALUES (10,'Fulltime','YES');

SELECT * From Pivot_TEST;

/* END */

Comments

  • DaveTDaveT Employee

    If you just want to create column values using the actual values in the DESCRIPTION column you can write the SQL with DECODE or CASE and have one column for each distinct value of DESCRIPTION. If ID is unique you don't need any aggregate functions. If ID is not unique you don't have to use SUM; you can use MIN or MAX or anything that works with a string. It doesn't really matter what aggregate is used, the purpose in this case is just to bring the like ID values into a single row.

    Of course, if you have conflicting information in different rows (such as both 1|Fulltime|YES and 1|Fulltime|NO) then then whether MIN or MAX does matter but which one is really correct?

  • Let me know if this is what you're looking for:

    SELECT ID,
    CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Fulltime', 'Yes', 'No')) ELSE 'No' END AS "Fulltime",
    CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Contract', 'Yes', 'No')) ELSE 'No' END AS "Contract",
    CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Seasonal', 'Yes', 'No')) ELSE 'No' END AS "Seasonal",
    CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Terminated', 'Yes', 'No')) ELSE 'No' END AS "Terminated",
    CASE WHEN EMPLOYED = 'YES' THEN MIN(DECODE(DESCRIPTION, 'Suspended', 'Yes', 'No')) ELSE 'No' END AS "Suspended"
    FROM Pivot_TEST GROUP BY ID, EMPLOYED order by ID;

  • Thank you very much for this, but I think I am over complicating this. Here is a smaller table that I just want to PIVOT.

    create table test2
    (
    master_country varchar(25)
    );

    INSERT INTO test2 values ('French Polynesia');
    INSERT INTO test2 values ('Colombia');
    INSERT INTO test2 values ('Brazil');
    INSERT INTO test2 values ('Mauritius');
    INSERT INTO test2 values ('Faroe Islands');

    select * from test2;

  • Jim_KnicelyJim_Knicely Administrator

    Similar to the suggestion by Shayon_Sanyal...

    dbadmin=> select * from test2;

    master_country

    French Polynesia
    Colombia
    Brazil
    Mauritius
    Faroe Islands
    (5 rows)

    dbadmin=> select
    dbadmin-> max(decode(rn, 1, master_country, null)) || ', ' ||
    dbadmin-> max(decode(rn, 2, master_country, null)) || ', ' ||
    dbadmin-> max(decode(rn, 3, master_country, null)) || ', ' ||
    dbadmin-> max(decode(rn, 4, master_country, null)) || ', ' ||
    dbadmin-> max(decode(rn, 5, master_country, null)) master_country
    dbadmin-> from (select master_country, row_number() over () rn from test2) foo;

    master_country

    French Polynesia, Colombia, Brazil, Mauritius, Faroe Islands
    (1 row)

  • Nice! What do you do, when the number of rows to pivot is not known? Is there something generic?

  • marcothesanemarcothesane - Select Field - Administrator

    We will soon publish a few use cases for Stored Procedures, which will soon be part of a Vertica 11 subversion.

    I actually have a pivot function that would do that for you in that collection.

    But it's early days just now, and won't work in the current version.

  • marcothesanemarcothesane - Select Field - Administrator

    The principle is this:
    With this table sensors:

    id|msr|val
    41|rpm|1000
    41|tmp|   25
    41|ps_i| 26.32
    

    You need this query:

    SELECT
      id
    , MAX(CASE msr WHEN 'rpm' THEN val END) AS val_rpm
    , MAX(CASE msr WHEN 'tmp' THEN val END) AS val_tmp
    , MAX(CASE msr WHEN 'ps_i' THEN val END) AS val_ps_i
    FROM sensors
    GROUP BY
      id
    ;
    

    You find that out by previously firing a:

    SELECT DISTINCT msr FROM sensors;
    

    .. and then generating as many MAX(CASE msr WHEN '???' THEN val END) AS val_???, as you get distinct values - and incorporate that into the final query.

  • HibikiHibiki Vertica Employee Employee

    I have developed the sample UDx to transform the long-form data to the wide-form data. Please try to use it if you can.

    https://github.com/h-serizawa/vertica-extension-functions/tree/master/long2wide

  • marcothesanemarcothesane - Select Field - Administrator

    Now that Vertica 11.0.1 is out, and with that, support for forward cursors in Stored procedures - with the same table sensors as above, try this:

    Create the test table:

      DROP TABLE IF EXISTS sensors;
      CREATE TABLE sensors (id,msr,val) AS
                SELECT 41,'rpm' ,1000
      UNION ALL SELECT 41,'tmp' ,  25
      UNION ALL SELECT 41,'ps_i',  26.32
      UNION ALL SELECT 42,'rpm' ,1200
      UNION ALL SELECT 42,'tmp' ,  26
      UNION ALL SELECT 42,'ps_i',  21.32
      ;
    

    Then, with a stored procedure, you can pivot the table. Into a target (temporary) table that you can then select from:

      CALL public.pivot(
        'id'        -- list of identifers -> GROUP BY
      , 'msr'       -- key column
      , 'val'       -- value column
      , 'sensors'  -- in table name
      , 'horizontal'-- out, pivoted, temp table name
      );
      -- out NOTICE 2005:  
      -- out     *** Running this Query *** 
      -- out CREATE LOCAL TEMPORARY TABLE horizontal ON COMMIT PRESERVE ROWS AS
      -- out SELECT
      -- out   id
      -- out , MAX(CASE msr WHEN 'ps_i' THEN val END) AS val_ps_i
      -- out , MAX(CASE msr WHEN 'rpm' THEN val END) AS val_rpm
      -- out , MAX(CASE msr WHEN 'tmp' THEN val END) AS val_tmp
      -- out FROM sensors
      -- out GROUP BY
      -- out   id
      -- out  pivot 
      -- out -------
      -- out      0
    
      SELECT * FROM horizontal;
      -- out  id | val_ps_i | val_rpm | val_tmp 
      -- out ----+----------+---------+---------
      -- out  41 |    26.32 | 1000.00 |   25.00
      -- out  42 |    21.32 | 1200.00 |   26.00
    

    The stored procedure is in prototype mode, and actually outputs the query it has generated, and it is going to fire.

    And just for playing purposes: Here's the full source code for the stored procedure:

      DROP PROCEDURE IF EXISTS public.pivot(
        idlist VARCHAR
      , keyname VARCHAR
      , valname VARCHAR
      ,tbname VARCHAR
      ,pvtbname VARCHAR
      ) ;
    
      CREATE PROCEDURE public.pivot (
        IN idlist   VARCHAR -- list of identifers -> GROUP BY
      , IN keyname  VARCHAR -- key column
      , IN valname  VARCHAR -- value column
      , IN tbname   VARCHAR -- in table name
      , IN pvtbname VARCHAR -- out, pivoted, temp table name
      ) LANGUAGE PLpgSQL SECURITY INVOKER AS $$
      DECLARE
        i      INTEGER := 0;
        sq     VARCHAR;
        c      VARCHAR;
        r      VARCHAR;
      BEGIN
        i := 1;
        WHILE SPLIT_PART(idlist,',',i) <> '' LOOP
          IF i = 1 THEN
            sq:= CHR(10)||'CREATE LOCAL TEMPORARY TABLE '||pvtbname||' ON COMMIT PRESERVE ROWS AS'||
                CHR(10)||'SELECT'||CHR(10)||'  '|| SPLIT_PART(idlist,',',i);
          ELSE
            sq:= sq||CHR(10)||', '|| SPLIT_PART(idlist,',',i);
          END IF;
          i := i + 1;
        END LOOP ;
        FOR c IN EXECUTE
        'SELECT DISTINCT '||keyname||' FROM '||tbname||' ORDER BY 1' LOOP
          sq := sq||CHR(10)||', MAX(CASE '||keyname||' WHEN '||QUOTE_LITERAL(c)||' THEN '||valname||' END) AS '||valname||'_'||c;
        END LOOP;
        i := 1;
        sq := sq ||CHR(10)||'FROM '||tbname;
        LOOP
          EXIT WHEN  SPLIT_PART(idlist,',',i) = '';
          IF i = 1 THEN
            sq:= sq||CHR(10)||'GROUP BY'||CHR(10)||'  '|| SPLIT_PART(idlist,',',i);
          ELSE
            sq:= sq||CHR(10)||', '|| SPLIT_PART(idlist,',',i);
          END IF;
          i := i + 1;
        END LOOP ;
        -- vv test output ...
        RAISE NOTICE '
          *** Running this Query *** %',sq;
        -- ^^ test output ...
        EXECUTE sq;
      END;
      $$;
    
    

    Happy playing ....

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file