Unable to load strings with double quotes

I am uploading a csv file onto vertica, with a following command- copy FROM LOCAL

Comments

  • Hi Anuja, Could you give an example of a string that you're trying to load? There are a couple of things that match this description. One would be resolved by setting a different ENCLOSED BY character. Another would be resolved by using the Rfc4180 parser included in the SDK examples (/opt/vertica/sdk/examples/). Thanks, Adam
  • Thanks Adam, The text we are trying to upload is - ZAO "COMPUWAY" We specifically used enclosed by in our command to get blank values and values with "," in them into vertica. we are uploading a .csv using a cloud server. We tried using quite a few other characters in the enclosed by statement. its a huge data dump, We are as of not just missing out on data specifically to above name which is around 1000 rows. Thanks, Anuja.
  • Hi Anuja, Hm... So, if you have that value, what does it look like in a field? some,other,fields,"ZAO "COMPUWAY"",... ? Or maybe one of the following: some,other,fields,"ZAO \"COMPUWAY\""",... some,other,fields,"ZAO ""COMPUWAY""",... If it's the first one above, unfortunately we can't readily parse data in that format. It's not well-defined -- we can't tell which quote is supposed to mean end-of-field and which is part of the string. If your data is already in this format, one option is to use Vertica's rejection files -- load as many rows as you can (and invoke COPY specifying a location to save rejected data so that you can find it), then open the rejection file, hand-edit the remaining records to remove some corruption (if it's literally just the above string you can probably just open the file in a text editor and do a global search-and-replace to add backslashes like the second example above) and try again. Repeat until the data's fully loaded. Ugly, but effective. For the general case, it's very hard to load data if the meaning of a character (whether it's a special character or just part of the string) is ambiguous. There are a couple common ways to encode strings in CSV files to avoid this; Vertica is able to parse both of the approaches above, though the second requires loading a parser plugin. Adam
  • Hm... The first row actually can be parsed with FILER
  • Thanks a lot, we will remove the double quotes in that case. we wanted to automate the process by a command. but i am guessing that wont be possible. However, because its entries for this specific name alone, we can manually edit them. Anuja

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file