Export table as RFC 4180 CSV

Is there a way to export a Vertica table as an RFC 4180 CSV, with just strings that contain embedded commas, quotes, line breaks, etc. enclosed by quotes, and embedded quotes escaped as double quotes?
I see this article which shows how to export with simple comma delimited values and use a unix command to enquote everything, but can this handle more general strings and not enquote columns that are not VARCHAR?
https://www.vertica.com/blog/create-a-csv-file-with-fields-enclosed-by-quotes/

Tagged:

Answers

  • marcothesanemarcothesane - Select Field - Administrator

    I think it's a matter of SQL generating SQL, taking advantage of Vertica's string over-standard function library.
    Creating an exemplary test table:

    -- create a guinea pig table with known RFC 4180 special cases
    DROP TABLE IF EXISTS public.to_export;
    CREATE TABLE public.to_export(id,title,special_case) AS 
              SELECT  1,'They called him "horse"'
                   , 'embedded double quote'
    UNION ALL SELECT  2,'The Hitch Hiker''s Guide'||CHR(10)||'to the Galaxy' 
                   , 'embedded newline and single quote'
    UNION ALL SELECT  3,'The bar called "Harry''s Bar"'
                   , 'both types of quotes'
    ;
    

    Now generate the SQL you will finally run. Put the below into a SQL file called gen_exp.sql.

    -- now, SQL generating SQL
    SELECT
      CASE ROW_NUMBER() OVER f
        WHEN 1 
        THEN 'SELECT'||CHR(10)||'       '
        ELSE                    '||'',''||'
      END
    ||'QUOTE_IDENT('||column_name||'::VARCHAR)'
    ||CASE ROW_NUMBER() OVER b
        WHEN 1
        THEN CHR(10)||'FROM '||table_schema||'.'||table_name||';'
        ELSE ''
      END
    FROM columns
    WHERE table_schema='public' AND table_name='to_export'
    WINDOW
      f AS (ORDER BY ordinal_position)
    , b AS (ORDER BY ordinal_position DESC)
    ORDER BY ordinal_position
    ;
    

    Testing:

    $ vsql -Atf gen_exp.sql
    SELECT
           QUOTE_IDENT(id::VARCHAR)
    ||','||QUOTE_IDENT(title::VARCHAR)
    ||','||QUOTE_IDENT(special_case::VARCHAR)
    FROM public.to_export;
    

    running:

    $ vsql -Atf gen_exp.sql | vsql -At
    "1","They called him ""horse""","embedded double quote"
    "2","The Hitch Hiker's Guide
    to the Galaxy","embedded newline and single quote"
    "3","The bar called ""Harry's Bar""","both types of quotes"
    

    If you need a title line in the output, shout. If you need numerics to not be in quotes, shout, too. Both of those can be generated with the help of the columns system table, too.

  • marcothesanemarcothesane - Select Field - Administrator

    Couldn't edit it and had some time. Might come in handy for myself in the future ....
    Here goes the whole SQL generating SQL script. I check for int, numeric or float as basic data types for refraining from quoting the literals, and rely, for now getting away with it, to receive the rows in a UNION SELECT in the order I placed the various SELECTs into the overall UNION SELECT statement - so that the generation of the title line happens before the generation of the row lines ...

    SELECT
      CASE ROW_NUMBER() OVER f
        WHEN 1 
        THEN 'SELECT'||CHR(10)||'      '
        ELSE                    '||'',''||'
      END
    ||QUOTE_LITERAL(column_name)
    ||CASE ROW_NUMBER() OVER b
        WHEN 1
        THEN CHR(10)||'UNION ALL'
        ELSE ''
      END
    FROM columns
    WHERE table_schema='public' AND table_name='to_export'
    WINDOW
      f AS (ORDER BY ordinal_position)
    , b AS (ORDER BY ordinal_position DESC)
    ORDER BY ordinal_position
    ;
    
    SELECT
      CASE ROW_NUMBER() OVER f
        WHEN 1 
        THEN 'SELECT'||CHR(10)||'       '
        ELSE                    '||'',''||'
      END
    ||CASE data_type_id
        WHEN  6 THEN column_name  -- int
        WHEN  7 THEN column_name  -- float
        WHEN 16 THEN column_name  -- numeric
        ELSE 'QUOTE_IDENT('||column_name||'::VARCHAR)'
      END
    ||CASE ROW_NUMBER() OVER b
        WHEN 1
        THEN CHR(10)||'FROM '||table_schema||'.'||table_name||';'
        ELSE ''
      END
    FROM columns
    WHERE table_schema='public' AND table_name='to_export'
    WINDOW
      f AS (ORDER BY ordinal_position)
    , b AS (ORDER BY ordinal_position DESC)
    ORDER BY ordinal_position
    ;
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file