COPY Statement not rejecting records when FILLER is being used
Hi,
We are reading data from a CSV and loading it into a vertica table using COPY command. The target table has a column set to NOT NULL. In COPY statement, we are using FILLER to transform data. This transformed data is then loaded into NOT NULL column. If there is null in CSV, the query fails with error like 'Cannot insert into NOT NULL column'.
However when we remove the FILLER and run the query, it sends the NULL value row to rejected table.
Can we have the same functionality while using FILLER in COPY query?
Below are the queries
CREATE TABLE CUST_CONTACT
(
Name varchar(80),
Contact varchar(15) NOT NULL
);
COPY CUST_CONTACT
(
Name,
v2 filler VARCHAR(15),
Contact as REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3')
) from 'Test5.csv' skip 1 delimiter ','
rejected data as table errortest3;
Thanks.
Answers
Set the CopyFaultTolerantExpressions parameter to 1!
Example:
dbadmin=> \! cat /home/dbadmin/Test5.csv TEST,TEST TEST, dbadmin=> COPY CUST_CONTACT dbadmin-> ( dbadmin(> Name, dbadmin(> v2 filler VARCHAR(15), dbadmin(> Contact as REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3') dbadmin(> ) from '/home/dbadmin/Test5.csv' skip 1 delimiter ',' dbadmin-> rejected data as table errortest3; ERROR 2501: Cannot set a NOT NULL column (Contact) to a NULL value in COPY statement dbadmin=> select * from errortest3; node_name | file_name | session_id | transaction_id | statement_id | batch_number | row_number | rejected_data | rejected_data_orig_length | rejected_reason -----------+-----------+------------+----------------+--------------+--------------+------------+---------------+---------------------------+----------------- (0 rows) dbadmin=> SELECT current_value, default_value, description FROM configuration_parameters WHERE parameter_name = 'CopyFaultTolerantExpressions'; current_value | default_value | description ---------------+---------------+--------------------------------------------------------------------------------------- 0 | 0 | Expression evaluation erros during COPY are rejected instead of rolling back the COPY (1 row) dbadmin=> ALTER SESSION SET CopyFaultTolerantExpressions = 1; ALTER SESSION dbadmin=> COPY CUST_CONTACT dbadmin-> ( dbadmin(> Name, dbadmin(> v2 filler VARCHAR(15), dbadmin(> Contact as REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3') dbadmin(> ) from '/home/dbadmin/Test5.csv' skip 1 delimiter ',' dbadmin-> rejected data as table errortest3; Rows Loaded ------------- 1 (1 row) dbadmin=> SELECT * FROM errortest3; node_name | file_name | session_id | transaction_id | statement_id | batch_number | row_number | rejected_data | rejected_data_orig_length | rejected_reason --------------------+-------------------------+-----------------------------------+-------------------+--------------+--------------+------------+-------------------+---------------------------+--------------------------------------------------------------------------------------- v_test_db_node0001 | /home/dbadmin/Test5.csv | v_test_db_node0001-292870:0x24805 | 45035996274662197 | 1 | 0 | 1 | Tuple (TEST,Null) | 17 | ERROR 2501: Cannot set a NOT NULL column (Contact) to a NULL value in COPY statement (1 row)See:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/CapturingLoadExceptionsAndRejections.htm
The solution you provided is for version 9 , is there any solution for version Vertica Analytic Database v8.1.1-0?
Maybe just insert a dummy "NULL" value temporarily... then remove those records when the load is complete?
Example:
dbadmin=> TRUNCATE TABLE CUST_CONTACT; TRUNCATE TABLE dbadmin=> \! cat /home/dbadmin/Test5.csv HEADER1,HEADER2 TEST, TEST,TEST dbadmin=> COPY CUST_CONTACT dbadmin-> ( dbadmin(> Name, dbadmin(> v2 filler VARCHAR(15), dbadmin(> Contact as NVL(REGEXP_REPLACE(v2, '(\d{5})(\d{7})(\d{1})','\1-\2-\3'), 'NULL') dbadmin(> ) dbadmin-> from '/home/dbadmin/Test5.csv' skip 1 delimiter ',' dbadmin-> rejected data as table errortest3; Rows Loaded ------------- 2 (1 row) dbadmin=> CREATE TABLE CUST_CONTACT_NULLS AS dbadmin-> SELECT * dbadmin-> FROM CUST_CONTACT dbadmin-> WHERE Contact = 'NULL'; CREATE TABLE dbadmin=> DELETE dbadmin-> FROM CUST_CONTACT dbadmin-> WHERE Contact = 'NULL'; OUTPUT -------- 1 (1 row) dbadmin=> SELECT * FROM CUST_CONTACT; Name | Contact ------+--------- TEST | TEST (1 row) dbadmin=> SELECT * FROM CUST_CONTACT_NULLS; Name | Contact ------+--------- TEST | NULL (1 row)Thanks a lot @Jim_Knicely. It was indeed helpful.
Awesome!
Consider upgrading to enjoy the benefits of all the great new functionality in Vertica 9.1 or 9.2 !!!
https://www.vertica.com/docs/ReleaseNotes/9.2.x/Vertica_9.2.x_Release_Notes.htm
https://www.vertica.com/docs/ReleaseNotes/9.1.x/Vertica_9.1.x_Release_Notes.htm