COPY csv files into Table

Hi,

 

I am loading 1 CSV file having the below format to a table.

 

CSV File Format:

 

COL1,COL2,COL3

HELL,1,ME

"K,

ILL",2,YOU

"H

OW",7,"GO

TO"

 

Expected Table Output :

 

 

COL1   COL2   COL3

HELL    1          ME

K,ILL     2          YOU

HOW    7          GOTO

 

Kindly help me out to resolve this using only COPY command.

 

Till now I have tried this which is giving wrong output .

 

COPY MYTESTTAB(COL1 NULL as 'NotFound',COL2 NULL as 'NotFound',COL3 NULL as 'NotFound') FROM LOCAL 'c:\\c.csv' ENCLOSED BY '"' DELIMITER ',' RECORD TERMINATOR E'\n' SKIP 1

 

 

COL1   COL2   COL3

HELL    1          ME
ILL"       2         YOU
OW"     7          GO

 

Comments

  • SruthiASruthiA Vertica Employee Administrator

    Hi,

     

       I created a csv file called test.csv and loaded it using COPY from LOCAL as below.

     

    COPY copycsv(COL1 NULL as 'NotFound',COL2 NULL as 'NotFound',COL3 NULL as 'NotFound') FROM LOCAL 'test.csv' ENCLOSED BY '"' DELIMITER ',' record terminator E'\n' direct ENFORCELENGTH NO ESCAPE;

     

    I hope this is your desired output.

    Selection_198.png

     

     

    -Regards,

     Sruthi

  • Hi Sruthi,

     

    Thanks for your time for the below issue.

     

    But  its still not resolved in my end.

     

    My table output is :

     

    HELL    1    ME
    ILL"    2    YOU
    OW"    7    GO

     

    Have you loaded the attached CSV file ?

  • SruthiASruthiA Vertica Employee Administrator

    Hi,

     

       I have loaded the following csv file.

    Selection_199.png

     

    Since we are having new line as record terminator, the file you attached is not valid.

     

    -Regards,

     Sruthi

  • But this is the requirement.

     

    I need to load that file only.

     

     

    I can resolve that in shell scripting but requirement is only copy command in vertica

     

     

     

  • So this means you can only do it from inside the vsql ? 

  • Yes Adrian,

     

    Only using COPY command I need to resolve this .

  • Hi,

     

    Once again I am writing the requirement.

     

    CSV File Format:

     

    COL1,COL2,COL3

    HELL,1,ME

    "K,

    ILL",2,YOU

    "H

    OW",7,"GO

    TO"

     

    Expected Table Output :

     

     

    COL1   COL2   COL3

    HELL       1          ME

    K,\nILL    2          YOU

    H\nOW    7          GO\nTO

     

    Kindly help me out to resolve this using only COPY command.

  • marcothesanemarcothesane - Select Field - Administrator

    Hi deb0687 -

    You hit the common problem of the record delimiter within the string. As far as I know, that problem can't be resolved using the COPY command.

    The best solution would be to get a new version of the file with a different record delimter than the newline - by profiling the strings in the table to find characters that don't exist anywhere - and load that file with that record delimiter specified.

    If you can't - then, you need a program that automatically concatenates two or more adjacent lines of the data file as long as the number of string enclosing characters (single or double quote, usually) in the line, so far, is odd, before passing the line to Vertica.

     

    I wrote my own ODBC SQL client that can do that, and I load these files as a one-off this way.

     

    Good luck -

    Marco

    (I work for HP)

  • Hi Marco,

    Its possible in copy command.
    There is a option called PARSER. I am trying to do that.

    Please try to get it using this.

    Regards,
    Deb
  • It belive Marco is aware of the parser that can be used in the copy command, but i don't think you can fix it with copy.

    And if you can !? That is going to be some NINJA COPY  skill :) !!! 

     

  • marcothesanemarcothesane - Select Field - Administrator

    Hi Deb -

    Yes - I went through the Docu. Looks like you can write your own User Defined Parser - which would work like the DELIMITED, default, parser, but would read lines a bit differently:

    It would read a line, and count the number of string delimiters / enclosers. If the number is even, good. If the number is odd, it would repeat { read next line and concat it with the one already read } until the concatenated line contains an even number of string delimiters.

    Only then, the splitting into column literals would take place.

    Not exactly a stroll in the park, but maybe worth the try ... I did it myself - but I use a parametrised INSERT [/*+DIRECT */] statement - which, on server side, also maps to a COPY process.

    Cheers - marco

  • Hi Marco,

     

    Can you share the complete script here.

     

    Requirement :

     

    CSV File Format:

     

    COL1,COL2,COL3

    HELL,1,ME

    "K,

    ILL",2,YOU

    "H

    OW",7,"GO

    TO"

     

    Expected Table Output :

     

     

    COL1   COL2   COL3

    HELL       1          ME

    K,\nILL    2          YOU

    H\nOW    7          GO\nTO

  • marcothesanemarcothesane - Select Field - Administrator

    Deb -

    that's not a script;

    That's a C program using the ODBC API, of 6091 lines, that I wrote myself - and I've been optimising it for the last 20 years.

    There is no support for it or anything.

     

    And writing your own parser would also be a major task: You'd have to write and test a User Defined Loader function in C++; that would be roughly a one-week project for a proficient C programmer.

     

    I would rather insist that they re-create the file for you, using a different record delimiter, as said earlier.

    good luck-

    marco

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file