Handling Expression Errors in COPY Statements
We can transform data as it is loaded via a COPY command using a FILLER column and a COPY expression. In the example below a COPY expression uses the LOWER function to convert uppercase text to lowercase as its loaded.
Example:
dbadmin=> CREATE TABLE test (a INT, b VARCHAR(10)); CREATE TABLE dbadmin=> \! cat /home/dbadmin/test.txt 1|TEST1 2| 3|TEST3 A|TEST4 dbadmin=> COPY test (a, b) dbadmin-> FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad; Rows Loaded ------------- 3 (1 row) dbadmin=> SELECT * FROM test; a | b ---+------- 1 | TEST1 2 | 3 | TEST3 (3 rows) dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad; rejected_data | rejected_reason ---------------+--------------------------------------------- A|TEST4 | Invalid integer format 'A' for column 1 (a) (1 row)
That worked as expected.
But what happens when an error occurs because of the COPY expression? Let’s find out.
dbadmin=> TRUNCATE TABLE test; TRUNCATE TABLE dbadmin=> DROP TABLE test_bad; DROP TABLE dbadmin=> ALTER TABLE test ALTER COLUMN b SET NOT NULL; ALTER TABLE dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS LOWER(b_filler)) dbadmin-> FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad; ERROR 2501: Cannot set a NOT NULL column (b) to a NULL value in COPY statement dbadmin=> SELECT * FROM test; a | b ---+--- (0 rows)
Yikes! No rows were loaded because the COPY command was aborted!
If you’d rather have the COPY command report record rejections during transformations and proceed, set the CopyFaultTolerantExpressions parameter to 1 (TRUE).
dbadmin=> ALTER SESSION SET CopyFaultTolerantExpressions = 1; ALTER SESSION dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS lower(b_filler)) dbadmin-> FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad; Rows Loaded ------------- 2 (1 row) dbadmin=> SELECT * FROM test; a | b ---+------- 1 | test1 3 | test3 (2 rows) dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad; rejected_data | rejected_reason ----------------+--------------------------------------------------------------------------------- A|TEST4 | Invalid integer format 'A' for column 1 (a) Tuple (2,Null) | ERROR 2501: Cannot set a NOT NULL column (b) to a NULL value in COPY statement (2 rows)
That’s better!
Helpful Links:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/TransformingDataDuringLoads.htm
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/GeneralParameters.htm
Have fun!