loading text file into vertica and finding an issue with spaces between the values in columns

I am loading a text file into vertica using the copy command . It seems like the columns containing data in the database have spaces between the characters in the values.

copy test_schema.tabOne from local'testFile.txt' delimiter E'\t';


Comments

  • sample text file tab-delimited data:
    //+0DsAW99eYSL8H //+0DsAW99eYSL8H H126872739 0 Closer 2013-06-30 14:26:44.113000000 www.hp.com hp search  HP Internal Traffic Internal Internal HP Traffic    California Sebastopol   //+0DsAW99eYSL8H ProductPurchase 2013-06-30 16:32:46 1060.98 1060.98 1060.98 1060.98 0 0 0 0   CA 94951 1060.98 1060.98 1 1060.98 1060.98 1 1060.98 1060.98 1 //0b/V97999vlzkH //0b/V97999vlzkH H126833848 0 Closer 2013-06-25 14:30:51.117000000   search  Direct Direct Missing     Michigan Washington 48094  //0b/V97999vlzkH ProductPurchase 2013-06-25 14:30:51 89.99 89.99 89.99 89.99 0 0 0 0   MI 48042 89.99 89.99 1 89.99 89.99 1 89.99 89.99 1 //1f/sDi99Y1Apku //1f/sDi99Y1Apku H127564102 0 Closer 2013-09-18 09:38:50.077000000 www8.hp.com hp search  HP Internal Traffic Internal Internal HP Traffic    United States United States US     //1f/sDi99Y1Apku ProductPurchase 2013-09-18 10:17:27 44.98 44.98 44.98 44.98 0 0 0 0   MO 65024 44.98 44.98 1 44.98 44.98 1 44.98 44.98 1 
  • The way it looks in the database:


  • Please check the attached PNG. file. Can someone check the image and let me know why there are spaces between the characters in the columns of the field?

    Thanks
  • Hi Prasanna,

    I unfortunately can't see the attached PNG image...

    A couple thoughts here:

    - Is this possibly just an issue with output formatting?  You could look at the options in vsql (or in whatever tool you're using) to make sure it's displaying correctly.

    - Is the input file in UTF-8 format?  Vertica requires UTF-8 and may get confused by files in other encodings.  For example, if you load a UTF-16 file (and you're using mostly basic English characters), you'll see a null byte between each character, since for ASCII characters, UTF-16 looks exactly like UTF-8 with every other byte null.  (If you want to convert file formats, our "Iconverter" SDK example can do that for you as part of COPY; the "iconv" tool can do so at the command line.)

    Adam
  • Thanks Adam, the file is in utf-16 format and I have converted it to utf-8 using "iconv" tool at the command line. The data looks fine in the db. I am just validating to see if there's a loss of data due to the conversion.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file