Loading special characters using copy

Hi, I have a file that contains Spanish characters. When I load it with Vertica COPY, they are loaded as a different character: "�".
How can I retain the original characters? Example: "Fénix".

The file is UTF-8, and the target column is VARCHAR.

Thanks
NJ

Comments

  • Options
    Found the issue: it is with the DB browsers/GUIs. vsql shows that the data is correct. Now the question is: why do DB browsers show it differently? Is it because of the drivers?
  • Options
    Navin_C (Vertica Customer)
    Hi Navneet,

    This link might help you with settings in your SQL IDE.
    Chinese characters in Vertica

    BTW, which SQL IDE are you using?

    Hope this helps.
    NC

  • Options
    Prasanta_Pal (Employee)
    Vertica requires incoming data to be UTF-8 and, to avoid load performance overhead, does not verify that it is. It is therefore up to the DBA to confirm that the source files are UTF-8 compliant before loading. You can use the Linux file command on the source file to determine the encoding of the data, e.g.:
    $ file data*
    data.txt: ISO-8859 text

    If the source file is not encoded as UTF-8, you can use the iconv command to convert it (set -f to the file's actual source encoding), e.g.:
    iconv -f ISO88599 -t utf-8 data.txt > data-utf8.txt

    Also see the small test below, which works fine:
    ----------------------------------------------------------------------------
    dbadmin=> create table test_spanish(str Varchar(30));
    CREATE TABLE
    dbadmin=> insert into test_spanish values ('Fénix');
     OUTPUT
    --------
          1
    (1 row)

    dbadmin=> commit;
    COMMIT
    dbadmin=> select * from test_spanish;
      str 
    -------
     Fénix
    (1 row)

    dbadmin=> show locale;
      name  |               setting               
    --------+--------------------------------------
     locale | en_US@collation=binary (LEN_KBINARY)
    (1 row)

    dbadmin=> \q
    ppal1:/home/dbadmin $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=
    ppal1:/home/dbadmin $
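
As a rough sketch of the pre-load check described in the comment above, the two steps (validate, then convert) could be wrapped in small shell helpers. The function names is_utf8 and to_utf8 are hypothetical, and the source encoding passed to to_utf8 is an assumption you would need to confirm for your own file. iconv is used as the validator here because, without the -c option, it exits non-zero when it meets bytes that are invalid in the declared source encoding.

```shell
# Hypothetical helper: succeeds only if the whole file decodes as valid UTF-8.
# iconv exits non-zero on the first invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical helper: convert a file to UTF-8.
#   $1 = source encoding (an assumption; confirm it for your data)
#   $2 = input file
#   $3 = output file
to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}
```

For example, `is_utf8 data.txt || to_utf8 ISO-8859-1 data.txt data-utf8.txt` converts only when validation fails. ISO-8859-1 is just an illustrative guess at the source encoding, since the file command output shown earlier ("ISO-8859 text") does not say which ISO-8859 variant is in use.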




  • Options
    Thanks, guys.
    Prasanta, as I wrote, the issue appears to be with the GUI; nothing is wrong with Vertica per se.

    I tried with Eclipse DBeaver and SQL Workbench.
