Strange Behavior with Whitespace String

mark_d_drakemark_d_drake Community Edition User
edited June 2022 in General Discussion

Is this expected...

VMart=> create table VARCHAR_TEST (X VARCHAR2(32));
CREATE TABLE

VMart=> insert into VARCHAR_TEST values (REPEAT('A',32));
 OUTPUT
--------
      1
(1 row)


VMart=> insert into VARCHAR_TEST values (REPEAT('A',33));
ERROR 8682:  String of 33 octets is too long for type Varchar(32) for column X

VMart=> insert into VARCHAR_TEST values (REPEAT(' ',33));
 OUTPUT
--------
      1
(1 row)


VMart=> SELECT X, LENGTH(x), NVL2(X,'NOT-NULL',NULL) from VARCHAR_TEST
VMart-> ;
                X                 | LENGTH |   NVL2
----------------------------------+--------+----------
                                  |     32 | NOT-NULL
 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA |     32 | NOT-NULL
(2 rows)

I was not allowed to insert 33 'A's into VARCHAR_TEST - Expected
I was allowed to insert 33 'SPACE' characters into VARCHAR_TEST - Unexpected
I ended up with 32 Space characters - sort of exected in the previous behavoir is correct...

But why trim a string of whitespace but throw on a string of non-whitepace.

NB Oracle and Postgres both reject a string of 33 space characters..

Before anyone asks, I am try to reconcile differences in behavior observed when porting YADAMU's corner case test suite for YADAMU from Oracle, Postgres, SQLServer, DB2, MySQL, Snowflake to Vetica

Answers

  • Bryan_HBryan_H Vertica Employee Administrator

    INSERT is defined by SQL:1999 so must behave according to ANSI standard.
    COPY is vendor-specific. The default behavior for Vertica's implementation of COPY is to truncate strings and rows. Add the ENFORCELENGTH keyword to enforce string and row length. Please see details at https://www.vertica.com/docs/11.1.x/HTML/Content/Authoring/DataLoad/CapturingLoadExceptionsAndRejections.htm?zoom_highlight=ENFORCELENGTH

  • mark_d_drakemark_d_drake Community Edition User

    Bryan

    What about the fact that a string consisting only of spaces is 'trimmed' while a string of non-space characters raises an error. I wasn't using COPY in this example.

  • mark_d_drakemark_d_drake Community Edition User
    edited June 2022

    I tried adding ENFORCELENGTH in places where i do use COPY

    eg

    copy "t_postgres1"."character_types" ("character_col","character_varying_col","character_max_col","character_varying_max_col","character_4000_col","character_varying_4000_col","character_64k_col","character_varying_64k_col","text_col") from '/mnt/shared/stagingArea/db/vertica/YST-727BBF569788CFD4D435803BB357606A-00001.001' PARSER fcsvparser(type='rfc4180', header=false, trim=false) NULL '' ENFORCELENGTH REJECTED DATA AS TABLE "YRT-B13A7ACF6BDA853D173B2622ED572CBD"  NO COMMIT;   /* [WRITER] Worker(1)] */
    

    and I got a notice..

    Parameter enforcelength was not registered by the function and cannot be coerced to a definite data type
    

    Does not seem to cause errors

  • Bryan_HBryan_H Vertica Employee Administrator
    edited June 2022

    For the COPY error: ENFORCELENGTH isn't compatible with fcsvparser but is allowed as a column modifier, e.g.:
    copy varchar8 ("v" ENFORCELENGTH) from ...
    Overall, this may be a documentation issue, I will do more research and testing compared to other systems - SQL Server has an explicit setting ANSI_PADDING around trailing whitespace, for example - however, we were built on PostgreSQL and v9 does this by default also:

    psql (9.2.24)
    postgres=# create table varchar8 (v varchar(8));
    CREATE TABLE
    postgres=# insert into varchar8 values ('12345678');
    INSERT 0 1
    postgres=# insert into varchar8 values ('123456789');
    ERROR: value too long for type character varying(8)
    postgres=# insert into varchar8 values (' 9');
    ERROR: value too long for type character varying(8)
    postgres=# insert into varchar8 values (' ');
    INSERT 0 1
    postgres=# select '|'||v||'|' from varchar8 ;

    ?column?

    |12345678|
    | |
    (2 rows)

  • mark_d_drakemark_d_drake Community Edition User

    Actually, I just rechecked postgres and stand corrected. Even with 14.1 it shows the same behavior as Vertica, in that trailing whitespace is truncated to the length of the column, as does MySQL. So it appears the Oracle is actually the outlier here.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file