Efficient way to bulk update with MERGE in Vertica
I have a table containing sales data for the last three years (hundreds of millions of rows). The source data receives minor changes every day, reaching back one to two months (error corrections), and I need to keep my Vertica table up to date. The idea is to run a MERGE every morning over the last two months to keep the data correct.
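To make the question concrete, here is roughly what I mean, sketched with hypothetical names (`sales` as the target, `sales_stage` as the daily source extract, `sale_id` as the key):

```sql
-- Daily refresh of the last two months (sketch, hypothetical schema).
MERGE INTO sales t
USING (
    SELECT *
    FROM sales_stage
    WHERE sale_date >= ADD_MONTHS(CURRENT_DATE, -2)
) s
ON t.sale_id = s.sale_id
WHEN MATCHED THEN
    UPDATE SET sale_date = s.sale_date,
               amount    = s.amount,
               qty       = s.qty
WHEN NOT MATCHED THEN
    INSERT (sale_id, sale_date, amount, qty)
    VALUES (s.sale_id, s.sale_date, s.amount, s.qty);
```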
Maybe I'm mistaken about how MERGE works, but as I understand it, it will delete ALL matched rows of the period, regardless of whether anything changed, and then perform a full insert. That doesn't look like a healthy strategy: rewriting (delete/insert) millions of identical rows every day just to correct two or three of them.
In SQL Server I used an "update only if the row hash changed" strategy: add a hash column to the table and merge (update) on the condition that the hashes are unequal.
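In Vertica syntax, I imagine that strategy would look roughly like the sketch below (hypothetical names again; `row_hash` is assumed to be precomputed on both sides, e.g. with `HASH()` over the payload columns). As far as I can tell, the extra predicate on the `WHEN MATCHED` clause is exactly what disqualifies the statement from Vertica's optimized MERGE path:

```sql
-- Update only rows whose hash actually changed (sketch, hypothetical schema).
MERGE INTO sales t
USING sales_stage s
ON t.sale_id = s.sale_id
WHEN MATCHED AND t.row_hash <> s.row_hash THEN
    UPDATE SET amount   = s.amount,
               qty      = s.qty,
               row_hash = s.row_hash
WHEN NOT MATCHED THEN
    INSERT (sale_id, amount, qty, row_hash)
    VALUES (s.sale_id, s.amount, s.qty, s.row_hash);
```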
Does merge-on-unequal-hash make sense in Vertica at all, given that adding such a condition turns off the optimized merge? Or is it unnecessary overcomplication? What is common practice for this kind of update in Vertica?