Loading special characters using copy

Hi, I have a file that contains Spanish characters. When I load it using the Vertica COPY command, they get loaded as a different character: "�"
How can I retain the original characters? Example: "Fénix".

The file is UTF-8, and the target column is VARCHAR.

Thanks
NJ

Comments

  • Found the issue. It is with the DB browsers/GUIs; vsql shows that the data is correct. Now the question is: why do the DB browsers show it differently? Is it because of the drivers?
  • Hi Navneet,

    This link might help you with settings in your SQL IDE.
    Chinese characters in Vertica

    BTW, which SQL IDE are you using?

    Hope this helps.
    NC

  • Vertica requires incoming data to be UTF-8, and to avoid load performance overhead we don't check whether it is. So it's up to the DBA to confirm UTF-8 compliance of the source files before loading. You can use the Linux file command on the source file to determine the encoding of the data, e.g.:
    $ file data*
    data.txt: ISO-8859 text

    If the encoding of the data in the source file is not UTF-8, you can use the iconv command to convert it, e.g.:
    iconv -f ISO88599 -t utf-8 data.txt > data-utf8.txt

    See also the small test below, which works fine:
    ----------------------------------------------------------------------------
    dbadmin=> create table test_spanish(str Varchar(30));
    CREATE TABLE
    dbadmin=> insert into test_spanish values ('Fénix');
     OUTPUT
    --------
          1
    (1 row)

    dbadmin=> commit;
    COMMIT
    dbadmin=> select * from test_spanish;
      str 
    -------
     Fénix
    (1 row)

    dbadmin=> show locale;
      name  |               setting               
    --------+--------------------------------------
     locale | en_US@collation=binary (LEN_KBINARY)
    (1 row)

    dbadmin=> \q
    ppal1:/home/dbadmin $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=
    ppal1:/home/dbadmin $
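
The file/iconv check described in the comment above can also be scripted. The following is only a minimal sketch in Python, not part of the original reply: the helper names is_valid_utf8 and convert_to_utf8 are hypothetical, and the source encoding (ISO-8859-1 here) is an assumption you would have to confirm for your own file. It reads the whole file into memory, so it is only suitable for files that fit in RAM.

```python
import tempfile
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the entire file decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(src, dst, source_encoding):
    """Decode src using source_encoding and write it to dst as UTF-8.

    source_encoding is an assumption supplied by the caller; this
    function does not detect the encoding on its own.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        latin1_file = Path(tmp) / "data.txt"
        utf8_file = Path(tmp) / "data-utf8.txt"

        # Simulate a source file that was saved as ISO-8859-1, not UTF-8.
        latin1_file.write_bytes("Fénix\n".encode("iso-8859-1"))
        print("valid before conversion:", is_valid_utf8(latin1_file))

        convert_to_utf8(latin1_file, utf8_file, "iso-8859-1")
        print("valid after conversion:", is_valid_utf8(utf8_file))
```

A check like this could run before COPY so that non-UTF-8 files are caught early. It does not explain display problems in a GUI client when the stored data is already correct, as was the case in this thread.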




  • Thanks guys,
    Prasanta, as I wrote, the issue appears to be with the GUI; there is nothing wrong with Vertica per se.

    I tried with Eclipse DBeaver and SQL Workbench.
