COPY Statement not rejecting records when FILLER is being used

Hi,
We are reading data from a CSV and loading it into a vertica table using COPY command. The target table has a column set to NOT NULL. In COPY statement, we are using FILLER to transform data. This transformed data is then loaded into NOT NULL column. If there is null in CSV, the query fails with error like 'Cannot insert into NOT NULL column'.

However when we remove the FILLER and run the query, it sends the NULL value row to rejected table.

Can we have the same functionality while using FILLER in COPY query?

Below are the queries

CREATE TABLE CUST_CONTACT
(
Name varchar(80),
Contact varchar(15) NOT NULL
);

COPY CUST_CONTACT
(
Name,
v2 filler VARCHAR(15),
Contact as REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3')
) from 'Test5.csv' skip 1 delimiter ','
rejected data as table errortest3;

Thanks.

Answers

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited November 30

    Set the CopyFaultTolerantExpressions parameter to 1!

    Example:

    dbadmin=> \! cat /home/dbadmin/Test5.csv
    TEST,TEST
    TEST,
    
    dbadmin=> COPY CUST_CONTACT
    dbadmin-> (
    dbadmin(> Name,
    dbadmin(> v2 filler VARCHAR(15),
    dbadmin(> Contact as REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3')
    dbadmin(> ) from '/home/dbadmin/Test5.csv' skip 1 delimiter ','
    dbadmin-> rejected data as table errortest3;
    ERROR 2501:  Cannot set a NOT NULL column (Contact) to a NULL value in COPY statement
    
    dbadmin=> select * from errortest3;
     node_name | file_name | session_id | transaction_id | statement_id | batch_number | row_number | rejected_data | rejected_data_orig_length | rejected_reason
    -----------+-----------+------------+----------------+--------------+--------------+------------+---------------+---------------------------+-----------------
    (0 rows)
    
    dbadmin=> SELECT current_value, default_value, description FROM configuration_parameters WHERE parameter_name = 'CopyFaultTolerantExpressions';
     current_value | default_value |                                      description
    ---------------+---------------+---------------------------------------------------------------------------------------
     0             | 0             | Expression evaluation erros during COPY are rejected instead of rolling back the COPY
    (1 row)
    
    dbadmin=> ALTER SESSION SET CopyFaultTolerantExpressions = 1;
    ALTER SESSION
    
    dbadmin=> COPY CUST_CONTACT
    dbadmin-> (
    dbadmin(> Name,
    dbadmin(> v2 filler VARCHAR(15),
    dbadmin(> Contact as REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3')
    dbadmin(> ) from '/home/dbadmin/Test5.csv' skip 1 delimiter ','
    dbadmin-> rejected data as table errortest3;
     Rows Loaded
    -------------
               1
    (1 row)
    
    dbadmin=> SELECT * FROM errortest3;
         node_name      |        file_name        |            session_id             |  transaction_id   | statement_id | batch_number | row_number |   rejected_data   | rejected_data_orig_length |                                    rejected_reason           
    --------------------+-------------------------+-----------------------------------+-------------------+--------------+--------------+------------+-------------------+---------------------------+---------------------------------------------------------------------------------------
     v_test_db_node0001 | /home/dbadmin/Test5.csv | v_test_db_node0001-292870:0x24805 | 45035996274662197 |            1 |            0 |          1 | Tuple (TEST,Null) |                        17 | ERROR 2501:  Cannot set a NOT NULL column (Contact) to a NULL value in COPY statement
    (1 row)
    

    See:
    https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/CapturingLoadExceptionsAndRejections.htm

  • haroonharoon Registered User

    The solution you provided is for version 9 , is there any solution for version Vertica Analytic Database v8.1.1-0?

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited December 2

    Maybe just insert a dummy "NULL" value temporarily... then remove those records when the load is complete?

    Example:

    dbadmin=> TRUNCATE TABLE CUST_CONTACT;
    TRUNCATE TABLE
    
    dbadmin=> \! cat /home/dbadmin/Test5.csv
    HEADER1,HEADER2
    TEST,
    TEST,TEST
    
    dbadmin=> COPY CUST_CONTACT
    dbadmin-> (
    dbadmin(> Name,
    dbadmin(> v2 filler VARCHAR(15),
    dbadmin(> Contact as NVL(REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3'), 'NULL')
    dbadmin(> )
    dbadmin-> from '/home/dbadmin/Test5.csv' skip 1 delimiter ','
    dbadmin-> rejected data as table errortest3;
     Rows Loaded
    -------------
               2
    (1 row)
    
    dbadmin=> CREATE TABLE CUST_CONTACT_NULLS AS
    dbadmin->   SELECT *
    dbadmin->     FROM CUST_CONTACT
    dbadmin->    WHERE Contact = 'NULL';
    CREATE TABLE
    
    dbadmin=> DELETE
    dbadmin->   FROM CUST_CONTACT
    dbadmin->  WHERE Contact = 'NULL';
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> SELECT * FROM CUST_CONTACT;
     Name | Contact
    ------+---------
     TEST | TEST
    (1 row)
    
    dbadmin=> SELECT * FROM CUST_CONTACT_NULLS;
     Name | Contact
    ------+---------
     TEST | NULL
    (1 row)
    
  • abdulrehman86abdulrehman86 Registered User

    Thanks a lot @Jim_Knicely. It was indeed helpful.

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited December 3

    Awesome!

    Consider upgrading to enjoy the benefits of all the great new functionality in Vertica 9.1 or 9.2 !!! ;)

    https://www.vertica.com/docs/ReleaseNotes/9.2.x/Vertica_9.2.x_Release_Notes.htm
    https://www.vertica.com/docs/ReleaseNotes/9.1.x/Vertica_9.1.x_Release_Notes.htm

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file