Logically right, but "strange" MERGE behavior.

Hello,

Let's suppose that we have an empty super projection with four fields: dateTime, bannerId, geo, count. We're going to bulk import some data from external CSV file using MERGE.

Here is example of such CSV file (dateTime,bannerId,geo,impressions):
2014-01-01,1,US,50
2014-01-01,1,US,10
2014-01-01,1,DE,75
Please pay attention to the first two lines of CSV file. First two lines have equal "key" (dateTime,bannerId,geo) - this is very important!

Here is piece of SQL to make MERGE (taken from http://www.vertica.com/2012/08/27/load-new-data-and-modify-data-simultaneously/):
MERGE INTO impressions tgt
USING new_impressions src
       ON src.dateTime = tgt.dateTime
       AND src.bannerId = tgt.bannerId
       AND src.geo = tgt.geo
WHEN MATCHED THEN
    UPDATE SET count = tgt.count + src.count
WHEN NOT MATCHED THEN
    INSERT VALUES (src.dateTime, src.bannerId, src.geo, src.count);
 
Now in our super projection we have three lines:
2014-01-01,1,US,50
2014-01-01,1,US,10
2014-01-01,1,DE,75
But first two of them have similar "key" (dateTime,bannerId,geo) despite the fact we were using "WHEN MATCHED THEN UPDATE SET count = tgt.count + src.count" in our MERGE query.

Is there any way to get only 2 lines as result of MERGEing without pre MERGE CSV file processing?
2014-01-01,1,US,60 (sum of lines with equal keys)
2014-01-01,1,US,10
Thank you!

Comments

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file