Removing Line Feed

Is there a easy way to remove line feed during copy? We have a row in a flat file that has line feed character which causes copy to fail. Can we do something in copy?

Comments

  • Hi, Do you know what character is the line feed? Maybe you can use the record terminator option in the copy command. For example this copy will load when the line feed is \n and the text file has header column copy test_table from 'test_file.csv' null '' skip 1 delimiter ',' enclosed by '"' record terminator E'\n' direct; Hope this helps, Eugenia
  • Mohit_AnchliaMohit_Anchlia Registered User
    My record that is failing looks like this: --- hello hello I am \n here\n ---- Above 2 lines are actually just one line. The first '\n' is causing issues. Would the command you gave help? If yes then could you elaborate on what E'\n' means? My current copy command is this: COPY cg.TOKENTABLE_temp_ext FROM LOCAL :input_file GZIP rejected data :rej_file exceptions :err_file DELIMITER ' ' NULL E'\\\N' DIRECT;
  • The E’\N’ means that the end of line is an escape character line feed. Do you see \n in the file or is \n that is a the (Line Feed). If so my example should work.
  • Mohit_AnchliaMohit_Anchlia Registered User
    I see '\n' embedded in a column which is not a real line feed but most likely user hit enter in a text field that we are trying to upload. I see I already have E'\\\N' in my copy command, is that different?: COPY cg.TOKENTABLE_temp_ext FROM LOCAL :input_file GZIP rejected data :rej_file exceptions :err_file DELIMITER ' ' NULL E'\\\N' DIRECT;
  • Yes the one that you have is different, in your command you are indicating that nulls are E'\\\N'. If you have the line feed as \n just use record terminator '\n'. Try like this COPY cg.TOKENTABLE_temp_ext FROM LOCAL :input_file GZIP rejected data :rej_file exceptions :err_file DELIMITER ' ' record terminator '\n' DIRECT;
  • Mohit_AnchliaMohit_Anchlia Registered User
    If you do that wouldn't vertica think that the below line is really 2 lines? However, this needs to be loaded as just one line/row. The first 'I am \n here' is a column. --- hello hello I am \n here\n ----
  • Ah! Sorry I misunderstood what you were looking. --- hello hello I am \n here\n ---- it is just one column, what are the other field? I just notice in your COPY delimiter ' ' what this means is that the data is separated by space so in you example is Col 1 Hello Col 2 hello Col 3 I Col 4 am NEW RECORD Col 1 Here In my example is COL 1 Hello hello I am New Record COL 1 here. So what you want is just "hello hello I am here, my question is 1- How the other columns are separated? 2- How the other records are separated?
  • Mohit_AnchliaMohit_Anchlia Registered User
    Records are separated by new lines and columns are separated by tabs
  • Hi, If the records in your file are separated by new line feed and this particular column have \n and a new line feed, I don't I don't see a way that merge the 2 lines so Vertica can load it as new line. You could clean the \n with a sed command before to load it, but the line feed will be there. The other option could be if you can enclose this column with " or something so you can use the option enclosed by " " and Vertica will treat everything as one column. Hope this make more sense. Eugenia
  • Sashi_1Sashi_1 Registered User
    Hi, If your source is SQL Server, you can use Replace() function on that source column to have all the values together and go as one single entry for that record. Thanks, S
  • Keyur_GondaliaKeyur_Gondalia Registered User
    I am facing the same issue. Even though I have used the enclosed by " (double quotes), vertica is not treating the columns as a single one. It is not able to distinguish between \n of column and new line feed. Is there any solution or patch for this?
  • Can you use the Rfc4180 add-on parser?  As bundled in our SDK.
  • Keyur_GondaliaKeyur_Gondalia Registered User
    I could locate the parser at /opt/vertica/sdk/examples , but not sure how to use it? Please help me on this.
  • If you run "make" (command-line tool) in that directory, it will compile all of the examples.  (You'll need to have "make", as well as the "g++" compiler, installed.  Both are readily available through yum or apt on most Linux systems.)  This will produce a ".so" file for all of the examples, including the Rfc4180 parser.  The "ParserFunctions.sql" file in that directory then contains examples of how to load the library, and how to use it.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file