ERROR 4800: String of 38 octets is too long for type Varchar(32)

Hi!

Each time it is a struggle to find what column is to small to fit loading data.
Especially on tables with large number of columns with same datatype, like varchar(32).
And when loading data is huge, it is quite a project to find what column is too small.

Why not to report column name that is too short, along with data types? That would make my life slightly easier. (Some other Veritca customers can feel same way).

That error is not specific to parser, it is from layer that checks data supplied by parser.

Please file enhancement request, to report column name for error 4800. Of course fix should not decrease data load performance.

Thank you!

Answers

  • Same here:

    ERROR 5861:  Error calling process() in User Function UDParser at [/home/scherepanov/git/vertica/extern/include/verticaudx/verticaudx1111/VerticaUDx.h:5682], error code: 0, message: User code caused Vertica to throw exception "Incorrect use of setter in processBlock"
    

    Table has 75 columns... try to guess where is setter of incorrect type...
    Why not to add column name to error, or at least column index passed to setter...

  • SruthiASruthiA Vertica Employee Administrator
    edited January 2023

    @Sergey_Cherepan_1 : Are you getting this error for direct COPY statement or UDX? this error message was improved in 9.1.0-2

    https://www.vertica.com/docs/ReleaseNotes/9.1.x/Vertica_9.1.x_Release_Notes.htm#9.1.0-2

    VER-62144 Error Handling, Execution Engine
    If you tried to insert data that was too wide for a VARCHAR or VARBINARY column, the error message did not specify the column name. This error message now includes the column name.

    I just tested it on 12.0.2 for insert statement.

    eonv1202u=> create table abc ( a varchar(2), b varchar(2), c varchar(2), d varchar(2), e varchar(2), f varchar(2) );
    CREATE TABLE
    eonv1202u=> insert into abc values ( 'aa', 'bb', 'cc', 'dd', 'eee', 'ff');
    ERROR 8682: String of 3 octets is too long for type Varchar(2) for column e
    eonv1202u=>

  • Errors are from UDParser. I am using v 12.0.2. Column names are not included in error test... pretty much everywhere in VerticaUDx.h.

    Here is one more error I just got:

    ERROR 5861: Error calling process() in User Function UDParser at [/home/scherepanov/git/vertica/extern/include/verticaudx/verticaudx12022/VerticaUDx.h:5748], error code: 0, message: User code caused Vertica to throw exception "Cannot set NOT NULL field to NULL"

    Looking at line 5748 at VerticaUDx.h:

        if (enforceNotNullProperty && !inTypes[idx].props.canBeNull) {
            ereport(ERROR, (errmsg("Cannot set NOT NULL field to NULL"),
                        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE)));
        }
    

    It is easy to add to error message at least idx, or better column name. Please file enhancement request to improve error messages in VerticaUDx.h.

    Same here, for error "incorrect use of setter", line 5726 in same file:

    void throwIncorrectUsageError() {
           ereport(ERROR,
                    (errmsg("Incorrect use of setter in processBlock"),
                    errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE)));
    }
    

    Please ask to add at least column index to error text, better column name.

  • Just made changes to VerticaUDx.h, now reporting column index and column name in error:

        if (enforceNotNullProperty && !inTypes[idx].props.canBeNull) {
            ereport(ERROR, (errmsg("Cannot set NOT NULL field to NULL, field idx %ld name %s",
                                   idx, inTypes[idx].getName().c_str()),
                        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE)));
        }
    

    Now, error message looks much better:

    ERROR 5861: Error calling process() in User Function UDParser at [/home/scherepanov/git/vertica/extern/include/verticaudx/verticaudx12022/VerticaUDx.h:5748], error code: 0, message: User code caused Vertica to throw exception "Cannot set NOT NULL field to NULL, field idx 0 name dt"

    Would be very nice if Vertica post VerticaUDx API to github. Anyway, full VerticaUDx code is included in each Vertica install. I would be committing changes, there are quite a few places where I want to do improvements...

  • Sergey_Cherepan_1Sergey_Cherepan_1 ✭✭✭
    edited February 2023

    This error message (without column name or index) is from Vertica loading path, for and copy command.

    Here is reproducer.

    create table test.t (v varchar(10));
    echo blablablablabla | /opt/vertica/bin/vsql -c "copy test.t from stdin abort on error"
    /opt/vertica/bin/vsql -c "select * from test.t"

    v

    blablablab
    (1 row)

    Vertica default FCSVPARSER parser is trimming long strings by default. It is a little bit more work to force Vertica to actually load full string:

    echo blablablablabla | /opt/vertica/bin/vsql -c "copy test.t (z filler varchar(20), v as z) from stdin abort on error"
    ERROR 4800: String of 15 octets is too long for type Varchar(10)

    And here we go, this error from regular Veritca copy command. No column index / column name in error text.

    As correctly said in VER-62144, column name is present in error text for INSERT command.

    COPY command left out in a cold.

  • moshegmosheg Vertica Employee Administrator

    Did you try the following COPY statement which shows the column name?

    create table test001 (v varchar(10));
    copy test001 from stdin ENFORCELENGTH REJECTED DATA AS TABLE rejected001 abort on error;
    123456789012345
    \.
    vsql:test.sql:7: ERROR 2035:  COPY: Input record 1 has been rejected (The 15-byte value is too long for type Varchar(10), column 1 (v))
    select * from test001;
     v
    ---
    (0 rows)
    
    select * from rejected001;
        node_name     | file_name |            session_id             |  transaction_id   | statement_id | batch_number | row_number |  rejected_data  | rejected_data_orig_length |                         rejected_reason
    ------------------+-----------+-----------------------------------+-------------------+--------------+--------------+------------+-----------------+---------------------------+------------------------------------------------------------------
     v_eevdb_node0001 | STDIN     | v_eevdb_node0001-3222224:0x15fb92 | 45035996273973428 |            1 |            0 |          1 | 123456789012345 |                        15 | The 15-byte value is too long for type Varchar(10), column 1 (v)
    (1 row)
    
  • Sergey_Cherepan_1Sergey_Cherepan_1 ✭✭✭
    edited February 2023

    Thanks for pointing out.
    Looks like there are several places where error like that is being reported, and seems to be in most cases error reporting is good.
    Unfortunately, all of Vertica UDx c++ API does not have good reporting. I am hitting error messages from UDParser and from setNull, without column name/index. If you will look at Vertica UDx c++ API code, pretty much everywhere error reporting is not up to Vertica standards.
    That to say, it is not end of the world, I still can trace error down to column, just taking more time than it should.

    BTW I can fix error reporting in UDx API very easily... code is in every Vertica installation... would be nice if I can commit my changes somewhere, and Vertica can incorporate them in code.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file