Why MERGE does not consider null values "properly" ?
Hi All,
I am trying to make a MERGE which involves columns with NULL values. Here a simple example:
CREATE TABLE test_merge_dst AS SELECT 0 AS id, 'Foo' AS name, 'Bar' AS surname1, NULL AS surname2 CREATE TABLE test_merge_src AS SELECT 0 AS id, 'Foo' AS name, 'Bar' AS surname1, NULL AS surname2 MERGE INTO test_merge_dst AS dst USING test_merge_src src ON (src.id = dst.id AND src.name = dst.name AND src.surname1 = dst.surname1 AND src.surname2 = dst.surname2) WHEN NOT MATCHED THEN INSERT (id, name, surname1, surname2) VALUES (src.id, src.name, src.surname1, src.surname2)
The expected result from this merge is NO merge, since I am merging the same values for a single row. The result is this:
id | name | surname1 | surname2 |
---|---|---|---|
0 | Foo | Bar | [NULL] |
0 | Foo | Bar | [NULL] |
Why does this happen ?, I've been reading the documentation but no reference to NULL values are found.
Regards
Best Answer
-
mosheg Vertica Employee Administrator
The issue with the MERGE statement is that it involves columns with NULL values. When comparing NULL values with the "=" operator, it returns NULL instead of true, which causes the MERGE to not match the rows as expected.
To compare nullable values, you can use the COALESCE function to convert the NULL value, which allows the comparison to return a true or false value.To fix the issue, you can modify the MERGE statement to use the COALESCE function on the nullable columns, like this:
MERGE INTO test_merge_dst AS dst USING test_merge_src src ON (src.id = dst.id AND src.name = dst.name AND src.surname1 = dst.surname1 AND COALESCE(CAST(dst.surname2 AS VARCHAR), '') = COALESCE(src.surname2::VARCHAR, '') ) WHEN NOT MATCHED THEN INSERT (id, name, surname1, surname2) VALUES (src.id, src.name, src.surname1, src.surname2);
In addition, the latest Vertica versions support the following syntax:
merge into test_merge_dst as dst using test_merge_src as src on src.id = dst.id and src.name = dst.name and src.surname1 = dst.surname1 and src.surname2 <=> dst.surname2 when not matched then insert (id, name, surname1, surname2) values (src.id, src.name, src.surname1, src.surname2);
1