Handling Expression Errors in COPY Statements

Jim_KnicelyJim_Knicely - Select Field - Administrator

We can transform data as it is loaded via a COPY command using a FILLER column and a COPY expression. In the example below a COPY expression uses the LOWER function to convert uppercase text to lowercase as its loaded.

Example:

dbadmin=> CREATE TABLE test (a INT, b VARCHAR(10));
CREATE TABLE

dbadmin=> \! cat /home/dbadmin/test.txt
1|TEST1
2|
3|TEST3
A|TEST4

dbadmin=> COPY test (a, b)
dbadmin->   FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad;
Rows Loaded
-------------
           3
(1 row)

dbadmin=> SELECT * FROM test;
a |   b
---+-------
1 | TEST1
2 |
3 | TEST3
(3 rows)

dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad;
rejected_data |               rejected_reason
---------------+---------------------------------------------
A|TEST4       | Invalid integer format 'A' for column 1 (a)
(1 row)

That worked as expected.

But what happens when an error occurs because of the COPY expression? Let’s find out.

dbadmin=> TRUNCATE TABLE test;
TRUNCATE TABLE

dbadmin=> DROP TABLE test_bad;
DROP TABLE

dbadmin=> ALTER TABLE test ALTER COLUMN b SET NOT NULL;
ALTER TABLE

dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS LOWER(b_filler))
dbadmin->   FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad;
ERROR 2501:  Cannot set a NOT NULL column (b) to a NULL value in COPY statement

dbadmin=> SELECT * FROM test;
a | b
---+---
(0 rows)

Yikes! No rows were loaded because the COPY command was aborted!

If you’d rather have the COPY command report record rejections during transformations and proceed, set the CopyFaultTolerantExpressions parameter to 1 (TRUE).

dbadmin=> ALTER SESSION SET CopyFaultTolerantExpressions = 1;
ALTER SESSION

dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS lower(b_filler))
dbadmin->   FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad;
Rows Loaded
-------------
           2
(1 row)

dbadmin=> SELECT * FROM test;
a |   b
---+-------
1 | test1
3 | test3
(2 rows)

dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad;
rejected_data  |                                 rejected_reason
----------------+---------------------------------------------------------------------------------
A|TEST4        | Invalid integer format 'A' for column 1 (a)
Tuple (2,Null) | ERROR 2501:  Cannot set a NOT NULL column (b) to a NULL value in COPY statement
(2 rows)

That’s better!

Helpful Links:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/TransformingDataDuringLoads.htm
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/GeneralParameters.htm

Have fun!

Sign In or Register to comment.