Purge_Table operations taking long time.

esangrame · July 2019

Hi,

Target table : 60 Billion records
Stage table : 800 Million records.

Every time we merge Stage to Target a lot of delete vectors are created.
So we purge the table before exiting out of the script.

Purge table takes long time to successfully execute and come out.
By that time next batch of record arrives and the earlier Purge_table operation is not completed, so the next merge fails with

timeout error Timed out T locking Table: Target_table. X held by [user:

is there a way to fasten the purge_table operation.
Please suggest a way to overcome this scenario.

Thanks

esangrame · July 2019

Adding to above.
Data in Stage table do not belong to any specific group i.e we can't bundle it to one partition so was not to use swap partitions.

marcothesane · July 2019

Hi esangrame -
With 60 billion records in an unpartitioned target table, or a target table partitioned by something else than time, I don't think you can change that behaviour using MERGE , then PURGE_TABLE() . After all, it's a complete MERGEOUT operation that you're triggering with the SELECT PURGE_TABLE() command.

Workarounds:

increase the delays between the batches to accommodate the necessary time you measured for the PURGE_TABLE().
Add a column cre_upd_ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, and re-partition the table by YEAR(cre_upd_ts)*100+MONTH(cre_upd_ts). In practically all ETL scenarios, once a row is older than a few months, it will never be touched again. So you would be safe to just maintain the last few months and then swap partitions. If the partition scheme you currently have is precious, then find an expression that combines the "yearmonth" mentioned above with the original partitioning expression, so that you do get a subset of partitions to work with.

Good luck ...

We're Moving!

Create My New Community Account Now

Purge_Table operations taking long time.

Comments

Leave a Comment