Why DBvisualizer and Vsql treat the length of a special character in different way?

AcisnerosAcisneros Vertica Customer

I have a great doubt, I'm new in Vertica & I'm trying to insert the string (hóla) but I got the next error in dbvisualizer:

My scenario in dbvisualizer

create table db_sieventas.test (
word varchar (4));

insert into db_sieventas.test values ('hóla');

[Code: 8682, SQL State: 22001] [Vertica]VJDBC ERROR: String of 5 octets is too long for type varchar(4) for column word

I take the lenght:
select octet_length('¿hóla'); = 5

==============================================
The same scenaerio in Vsql

In Vsql it'scompletly different here the result

uvtc840079=> select octet_length('hóla');
octet_length


         4

(1 row)

uvtc840079=> insert into db_sieventas.test values ('hóla');
OUTPUT


   1

(1 row)

In Vsql I could insert the row I'm confused, Can you help me?

Answers

  • SergeBSergeB - Select Field - Employee

    What is the default character encoding on the Operating System you are running VSQL from?

  • s_crossmans_crossman Vertica Employee Employee
    edited June 2021

    Acisneros,
    One thing I wanted to note is that in your supplied examples you did octel_length of two different strings.
    DBVis = select octet_length('¿hóla'); = 5
    VSQL = select octet_length('hóla');

    That aside, in my vsql 9.3 for the 'hóla' string I get 5, same in DBVis 12 on Windows 10 with Vertica JDBC 9.3 driver. So for me both report the same octet length of 5.

    I do see the error you see when inserting into the table, octet length 5 too long for varchar(4). This error is correct. Please refer to the following two Vertica doc pages and the sections I've extracted related to what you are seeing. Although not intuitive, when you define a varchar the length you define is octets not characters. So for a 5 octet string you'd need varchar(5). If you plan on loading a lot of foreign characters you may need to increase the varchar size to accommodate.

    https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/SQLReferenceManual/DataTypes/CharacterDataTypes.htm
    Setting Maximum Length
    When you define character columns, specify the maximum size of any string to be stored in a column. For example, to store strings up to 24 octets in length, use one of the following definitions:

    CHAR(24) /* fixed-length /
    VARCHAR(24) /
    variable-length */
    The maximum length parameter for VARCHAR and CHAR data types refers to the number of octets that can be stored in that field, not the number of characters (Unicode code points). When using multibyte UTF-8 characters, the fields must be sized to accommodate from 1 to 4 octets per character, depending on the data. If the data loaded into a VARCHAR/CHAR column exceeds the specified maximum size for that column, data is truncated on UTF-8 character boundaries to fit within the specified size.

    https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/SQLReferenceManual/Functions/String/OCTET_LENGTH.htm
    Parameters
    expression
    (CHAR or VARCHAR or BINARY or VARBINARY) is the string to measure.

    Notes
    If the data type of expression is a CHAR, VARCHAR or VARBINARY, the result is the same as the actual length of expression in octets. For CHAR, the length does not include any trailing spaces.

    I hope it helps,

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file