Options

"duplicate merge key detected in join" error on tables without keys

I’m trying to use MERGE to upsert aggregate data into a collection of tables.  These tables each have a set of fields and a set of metrics.  For each table some fields are nullable and some are not.  No keys are declared on the tables.  I’m using temp tables, which match the target tables, to hold the aggregate data, which is then consumed by the MERGE statement.  When I run the MERGE statements I receive a “Duplicate MERGE key detected in join” error, which shows that it is treating only the not null columns as a key despite my ON clause including syntax to include the null columns in the match.  

Sample pseudo syntax:
CREATE TABLE target (
fieldA varchar not null,
fieldB varchar not null,
fieldC varchar null,
metricA int,
metricB int);

CREATE TABLE temp (
fieldA varchar not null,
fieldB varchar not null,
fieldC varchar null,
metricA int,
metricB int);

MERGE INTO target trg
USING temp tmp ON (
    trg.FieldA = tmp.FieldA
    AND trg.FieldB = tmp.FieldB
    AND ((trg.FieldC IS NULL AND tmp.FieldC IS NULL) OR trg.FieldC = tmp.FieldC))
WHEN MATCHED THE UPDATE
    SET trg.MetricA = tmp.MetricA ,
    trg.MetricB = tmp.MetricB
WHEN NOT MATCHED THEN INSERT (
    FieldA ,
    FieldB ,
    FieldC ,
    MetricA ,
    MetricB )
VALUES (
    tmp.FieldA ,
    tmp.FieldB ,
    tmp.FieldC ,
    tmp.MetricA ,
    tmp.MetricB );




Comments

  • Options
    Hi David,
    this problem is caused by multiple matches between source and target table, meaning: the MERGE CONDITION gives more than one single row.

    try by modifying the MERGE CONDITION with:

    ..   AND ((trg.FieldC IS NULL AND tmp.FieldC IS NULL) OR (trg.FieldC = tmp.FieldC AND trg.FieldC IS NOT NULL)))

    ;)

  • Options
    Hey Pietro,

    I tried what you suggested but it did not work.  The odd thing is the sample code I provided seems to work fine.  But in the real world case, with a number of null and not null columns, I still get the error.  Will attempt to scale up the example next to see if there is a tipping point.

    Thanks,
    Dave
  • Options
    After doing some testing it appears that if we use ISNULL(trg.FieldC, -1) = ISNULL(tmp.FieldC, -1), we can fake out Vertica and it works.  
  • Options
    I remember that also for Slqserver the trick is valid....
  • Options
    If you have a lot of columns, you should merge by a hash column.
    Basically, you create an extra column and every time you insert or update a row you put HASH(FieldA, FieldB, ...) into the hash column.
    Merging through a hash column is much faster for large tables and you get rid of the issue you've described.
  • Options
    great this works, thanks for sharing
  • Options

    Hash should be added on the source table? unfortunately I am trying to merge data directly from a oracle table(as EXTERNAL) and we can not do that. So even it throw, is it merging all other rows except that row?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file