Purge_Table operations taking long time.
Hi,
Target table : 60 Billion records
Stage table : 800 Million records.
Every time we merge Stage to Target a lot of delete vectors are created.
So we purge the table before exiting out of the script.
Purge table takes long time to successfully execute and come out.
By that time next batch of record arrives and the earlier Purge_table operation is not completed, so the next merge fails with
timeout error Timed out T locking Table: Target_table. X held by [user:
is there a way to fasten the purge_table operation.
Please suggest a way to overcome this scenario.
Thanks
Tagged:
1
Comments
Adding to above.
Data in Stage table do not belong to any specific group i.e we can't bundle it to one partition so was not to use swap partitions.
Hi esangrame -
With 60 billion records in an unpartitioned target table, or a target table partitioned by something else than time, I don't think you can change that behaviour using MERGE , then PURGE_TABLE() . After all, it's a complete MERGEOUT operation that you're triggering with the
SELECT PURGE_TABLE()
command.Workarounds:
cre_upd_ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
, and re-partition the table byYEAR(cre_upd_ts)*100+MONTH(cre_upd_ts)
. In practically all ETL scenarios, once a row is older than a few months, it will never be touched again. So you would be safe to just maintain the last few months and then swap partitions. If the partition scheme you currently have is precious, then find an expression that combines the "yearmonth" mentioned above with the original partitioning expression, so that you do get a subset of partitions to work with.Good luck ...