The Vertica Forum recently got a makeover! Let us know what you think by filling out this short, anonymous survey.
Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!
Logically right, but "strange" MERGE behavior.
Hello,
Let's suppose that we have an empty super projection with four fields: dateTime, bannerId, geo, count. We're going to bulk import some data from external CSV file using MERGE.
Here is example of such CSV file (dateTime,bannerId,geo,impressions):
Here is piece of SQL to make MERGE (taken from http://www.vertica.com/2012/08/27/load-new-data-and-modify-data-simultaneously/):
Now in our super projection we have three lines:
Is there any way to get only 2 lines as result of MERGEing without pre MERGE CSV file processing?
Let's suppose that we have an empty super projection with four fields: dateTime, bannerId, geo, count. We're going to bulk import some data from external CSV file using MERGE.
Here is example of such CSV file (dateTime,bannerId,geo,impressions):
2014-01-01,1,US,50Please pay attention to the first two lines of CSV file. First two lines have equal "key" (dateTime,bannerId,geo) - this is very important!
2014-01-01,1,US,10
2014-01-01,1,DE,75
Here is piece of SQL to make MERGE (taken from http://www.vertica.com/2012/08/27/load-new-data-and-modify-data-simultaneously/):
MERGE INTO impressions tgt
USING new_impressions src
ON src.dateTime = tgt.dateTime
AND src.bannerId = tgt.bannerId
AND src.geo = tgt.geo
WHEN MATCHED THEN
UPDATE SET count = tgt.count + src.count
WHEN NOT MATCHED THEN
INSERT VALUES (src.dateTime, src.bannerId, src.geo, src.count);
Now in our super projection we have three lines:
2014-01-01,1,US,50But first two of them have similar "key" (dateTime,bannerId,geo) despite the fact we were using "WHEN MATCHED THEN UPDATE SET count = tgt.count + src.count" in our MERGE query.
2014-01-01,1,US,10
2014-01-01,1,DE,75
Is there any way to get only 2 lines as result of MERGEing without pre MERGE CSV file processing?
2014-01-01,1,US,60 (sum of lines with equal keys)Thank you!
2014-01-01,1,US,10
0
Comments
If you are expecting sum of lines with equal keys, then you should use a group by clause and sum the count.
Other then that I don't think the behavior of Merge is "Strange".
Can you show what is ddl for impressions and new_impressions and what is the data inside them referring to this example
Short answer: No!
Longer answer: No, you can't achieve it with MERGE.
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/MERGE.htm
Notes and Restrictions
- An optimized query plan for a
...MERGE
statement assumes that the data you are trying to merge conforms with any unique or primary key constraints you declare.PS
MERGE should fail on dups or you defined DISABLE_DUPLICATE_KEY_ERROR, so no dups detection. So logically and practically its a right MERGE.
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/VerticaFunct...
https://community.vertica.com/vertica/topics/_duplicate_merge_key_detected_in_join_error_on_tables_w...
select dateTime, bannerId, geo, count, sum(count) OVER (PARTITION BY dateTime, bannerId, geo) as dbgsumcount from <table> ;
select dateTime, bannerId, geo, count from <table> GROUP BY dateTime, bannerId, geo;