Race condition between two data loading scripts

Hello
I have one process that runs every minute and does COPY into a staging table of data from files.
I have a second process that runs once an hour that does the following:
SET SESSION AUTOCOMMIT TO OFF;
LOCK TABLE my_staging_table;
MERGE into full_table USING my_staging_table
TRUNCATE my_staging_table;
I dont have a way to synchronize these 2 processes. I see that some records are being lost. From the timestamps and analysis it looks like the COPY insert some data into the staging table at the same time as the TRUNCATE and the TRUNCATE happens AFTER the COPY.
I tried adding a LOCK TABLE my_staging_table before the COPY but it did not help.
How can I rewrite this so that I dont loose data? Maybe with table renaming?
Thanks,

Answers

  • moshegmosheg Vertica Employee Administrator

    Hi Brocha,
    As per our talk try to RENAME the table in the second process that runs once an hour.

  • Arumugaguru_MArumugaguru_M Vertica Employee

    Can you try running the second process statement with FOR UPDATE, something like this "MERGE into full_table USING my_staging_table FOR UPDATE" and see if that helps.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file