COPY csv files into Table
Hi,
I am loading 1 CSV file having the below format to a table.
CSV File Format:
COL1,COL2,COL3
HELL,1,ME
"K,
ILL",2,YOU
"H
OW",7,"GO
TO"
Expected Table Output :
COL1 COL2 COL3
HELL 1 ME
K,ILL 2 YOU
HOW 7 GOTO
Kindly help me out to resolve this using only COPY command.
Till now I have tried this which is giving wrong output .
COPY MYTESTTAB(COL1 NULL as 'NotFound',COL2 NULL as 'NotFound',COL3 NULL as 'NotFound') FROM LOCAL 'c:\\c.csv' ENCLOSED BY '"' DELIMITER ',' RECORD TERMINATOR E'\n' SKIP 1
COL1 COL2 COL3
HELL 1 ME
ILL" 2 YOU
OW" 7 GO
0
Comments
Hi,
I created a csv file called test.csv and loaded it using COPY from LOCAL as below.
COPY copycsv(COL1 NULL as 'NotFound',COL2 NULL as 'NotFound',COL3 NULL as 'NotFound') FROM LOCAL 'test.csv' ENCLOSED BY '"' DELIMITER ',' record terminator E'\n' direct ENFORCELENGTH NO ESCAPE;
I hope this is your desired output.
-Regards,
Sruthi
Hi Sruthi,
Thanks for your time for the below issue.
But its still not resolved in my end.
My table output is :
HELL 1 ME
ILL" 2 YOU
OW" 7 GO
Have you loaded the attached CSV file ?
Hi,
I have loaded the following csv file.
Since we are having new line as record terminator, the file you attached is not valid.
-Regards,
Sruthi
But this is the requirement.
I need to load that file only.
I can resolve that in shell scripting but requirement is only copy command in vertica
So this means you can only do it from inside the vsql ?
Yes Adrian,
Only using COPY command I need to resolve this .
Hi,
Once again I am writing the requirement.
CSV File Format:
COL1,COL2,COL3
HELL,1,ME
"K,
ILL",2,YOU
"H
OW",7,"GO
TO"
Expected Table Output :
COL1 COL2 COL3
HELL 1 ME
K,\nILL 2 YOU
H\nOW 7 GO\nTO
Kindly help me out to resolve this using only COPY command.
Hi deb0687 -
You hit the common problem of the record delimiter within the string. As far as I know, that problem can't be resolved using the COPY command.
The best solution would be to get a new version of the file with a different record delimter than the newline - by profiling the strings in the table to find characters that don't exist anywhere - and load that file with that record delimiter specified.
If you can't - then, you need a program that automatically concatenates two or more adjacent lines of the data file as long as the number of string enclosing characters (single or double quote, usually) in the line, so far, is odd, before passing the line to Vertica.
I wrote my own ODBC SQL client that can do that, and I load these files as a one-off this way.
Good luck -
Marco
(I work for HP)
Its possible in copy command.
There is a option called PARSER. I am trying to do that.
Please try to get it using this.
Regards,
Deb
It belive Marco is aware of the parser that can be used in the copy command, but i don't think you can fix it with copy.
And if you can !? That is going to be some NINJA COPY skill !!!
Hi Deb -
Yes - I went through the Docu. Looks like you can write your own User Defined Parser - which would work like the DELIMITED, default, parser, but would read lines a bit differently:
It would read a line, and count the number of string delimiters / enclosers. If the number is even, good. If the number is odd, it would repeat { read next line and concat it with the one already read } until the concatenated line contains an even number of string delimiters.
Only then, the splitting into column literals would take place.
Not exactly a stroll in the park, but maybe worth the try ... I did it myself - but I use a parametrised INSERT [/*+DIRECT */] statement - which, on server side, also maps to a COPY process.
Cheers - marco
Hi Marco,
Can you share the complete script here.
Requirement :
CSV File Format:
COL1,COL2,COL3
HELL,1,ME
"K,
ILL",2,YOU
"H
OW",7,"GO
TO"
Expected Table Output :
COL1 COL2 COL3
HELL 1 ME
K,\nILL 2 YOU
H\nOW 7 GO\nTO
Deb -
that's not a script;
That's a C program using the ODBC API, of 6091 lines, that I wrote myself - and I've been optimising it for the last 20 years.
There is no support for it or anything.
And writing your own parser would also be a major task: You'd have to write and test a User Defined Loader function in C++; that would be roughly a one-week project for a proficient C programmer.
I would rather insist that they re-create the file for you, using a different record delimiter, as said earlier.
good luck-
marco