Handling Expression Errors in COPY Statements
Jim_Knicely
- Select Field - Administrator
We can transform data as it is loaded via a COPY command using a FILLER column and a COPY expression. In the example below a COPY expression uses the LOWER function to convert uppercase text to lowercase as its loaded.
Example:
dbadmin=> CREATE TABLE test (a INT, b VARCHAR(10));
CREATE TABLE
dbadmin=> \! cat /home/dbadmin/test.txt
1|TEST1
2|
3|TEST3
A|TEST4
dbadmin=> COPY test (a, b)
dbadmin-> FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad;
Rows Loaded
-------------
3
(1 row)
dbadmin=> SELECT * FROM test;
a | b
---+-------
1 | TEST1
2 |
3 | TEST3
(3 rows)
dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad;
rejected_data | rejected_reason
---------------+---------------------------------------------
A|TEST4 | Invalid integer format 'A' for column 1 (a)
(1 row)
That worked as expected.
But what happens when an error occurs because of the COPY expression? Let’s find out.
dbadmin=> TRUNCATE TABLE test; TRUNCATE TABLE dbadmin=> DROP TABLE test_bad; DROP TABLE dbadmin=> ALTER TABLE test ALTER COLUMN b SET NOT NULL; ALTER TABLE dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS LOWER(b_filler)) dbadmin-> FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad; ERROR 2501: Cannot set a NOT NULL column (b) to a NULL value in COPY statement dbadmin=> SELECT * FROM test; a | b ---+--- (0 rows)
Yikes! No rows were loaded because the COPY command was aborted!
If you’d rather have the COPY command report record rejections during transformations and proceed, set the CopyFaultTolerantExpressions parameter to 1 (TRUE).
dbadmin=> ALTER SESSION SET CopyFaultTolerantExpressions = 1;
ALTER SESSION
dbadmin=> COPY test (a, b_filler FILLER VARCHAR(10), b AS lower(b_filler))
dbadmin-> FROM '/home/dbadmin/test.txt' REJECTED DATA TABLE test_bad;
Rows Loaded
-------------
2
(1 row)
dbadmin=> SELECT * FROM test;
a | b
---+-------
1 | test1
3 | test3
(2 rows)
dbadmin=> SELECT rejected_data, rejected_reason FROM test_bad;
rejected_data | rejected_reason
----------------+---------------------------------------------------------------------------------
A|TEST4 | Invalid integer format 'A' for column 1 (a)
Tuple (2,Null) | ERROR 2501: Cannot set a NOT NULL column (b) to a NULL value in COPY statement
(2 rows)
That’s better!
Helpful Links:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/TransformingDataDuringLoads.htm
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/GeneralParameters.htm
Have fun!