Restoring to an old snapshot takes more storage

amosmoss · March 2017

Hi,

Recently, my node's disks filled up. To handle that, I needed to optimize a couple of fact table superprojections.

The problem is that I have to duplicate the data to create the new projection. Since I did not have enough space left, I needed to free some first.

To do that, I decided to restore the DB to its state of two weeks ago, when it was less full. Then I would replace the projections and reload the last two weeks of data. The backup is located on a remote server (mounted as a local directory) and is incrementally backed up daily.

To my surprise, instead of deleting the newest files and freeing space immediately, the restore made the file system more congested: it reached 100% full and the restore failed.

How is that possible? Am I doing something wrong?

Thanks,
Amos.

skeswani · May 2017

Yes, it can happen, for two reasons that interact with each other.

1. Tuple Mover activity may result in new files being created.
2. When we copy the snapshot back, we first copy the files from the snapshot, and only then delete the files that were created by the Tuple Mover.

This can result in a short period during which both sets of files coexist: the ones from the older snapshot and the ones created by the Tuple Mover after the snapshot was taken.
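
To make the effect of that ordering concrete, here is a minimal Python sketch that models a restore as a set of file copies and deletes and reports the peak disk usage. It is only a toy model, not how vbr.py is implemented; the function name `peak_usage`, the file names, and the sizes are all hypothetical.

```python
def peak_usage(current_files, snapshot_files, copy_first=True):
    """Return the peak number of bytes on disk during a simulated restore.

    current_files and snapshot_files map a file name to its size in bytes.
    All names and sizes are illustrative, not taken from a real database.
    """
    on_disk = dict(current_files)
    peak = sum(on_disk.values())

    # Files that exist only in the snapshot must be copied in;
    # files that exist only on disk must be deleted.
    to_copy = {n: s for n, s in snapshot_files.items() if n not in on_disk}
    to_delete = [n for n in on_disk if n not in snapshot_files]

    def copy_snapshot_files():
        nonlocal peak
        for name, size in to_copy.items():
            on_disk[name] = size
            peak = max(peak, sum(on_disk.values()))

    def delete_stale_files():
        for name in to_delete:
            del on_disk[name]

    if copy_first:
        copy_snapshot_files()
        delete_stale_files()
    else:
        delete_stale_files()
        copy_snapshot_files()
    return peak


current = {"merged.gt": 656}    # file written by the Tuple Mover after the backup
snapshot = {"initial.gt": 428}  # file that exists only in the older snapshot

print(peak_usage(current, snapshot, copy_first=True))   # copy, then delete
print(peak_usage(current, snapshot, copy_first=False))  # delete, then copy
```

With copy-then-delete, the peak is the sum of both sets of files, which is why a nearly full disk can run out of space partway through the restore.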


See the example below.

1. Say I have a database that takes 428K space on disk

[skeswani@skeswani] [@:/]$ du -sh /scratch_b/verticadb/test/v_test_node0001_data/
428K    /scratch_b/verticadb/test/v_test_node0001_data/

2. I back it up

[skeswani@skeswani] [@:/]$ /opt/vertica/bin/vbr.py --task backup --config backup_snapshot.ini 
Starting backup of database test.
Participating nodes: v_test_node0001.
Snapshotting database.
Snapshot complete.
Approximate bytes to copy: 244032189 of 244032189 total.
[==================================================] 100%
Copying backup metadata.
Finalizing backup.
Backup complete!


3. Now add some data after the backup

skeswani=> insert /*+direct*/ into t select * from t; commit;
 OUTPUT 
 --------
 100000
(1 row)

4. Now the database is 620K on disk. It has two files: one holds the initial data in the table and the other is the result of the insert above.

[skeswani@skeswani] [@:/]$ du -sh /scratch_b/verticadb/test/v_test_node0001_data/
620K    /scratch_b/verticadb/test/v_test_node0001_data/

[skeswani@skeswani] [@:/]$ find /scratch_b/verticadb/test/v_test_node0001_data/ -name *.gt
/scratch_b/verticadb/test/v_test_node0001_data/731/028d04fcbcc5c6f1090c842bf744893800a0000000002243_0.gt
/scratch_b/verticadb/test/v_test_node0001_data/713/02edfdb8e99026fdaef7c4ac657be25a00a0000000002231_0.gt

5. Now merge the inserted data with the old data. (We do this manually here, but in Vertica it is a background Tuple Mover task that runs over time as data accumulates.)

skeswani=> select do_tm_task('MergeOut'); 
                           do_tm_task                           
 ----------------------------------------------------------------
 Task: mergeout
(Table: public.t) (Projection: public.t_super)
(1 row)

So now there are three files: (a) the initial data in the table, (b) the insert we did, and (c) the merged file created by the Tuple Mover.

[skeswani@skeswani] [@:/]$ du -sh /scratch_b/verticadb/test/v_test_node0001_data/
1.1M    /scratch_b/verticadb/test/v_test_node0001_data/

[skeswani@skeswani] [@:/]$ find /scratch_b/verticadb/test/v_test_node0001_data/ -name *.gt
/scratch_b/verticadb/test/v_test_node0001_data/731/028d04fcbcc5c6f1090c842bf744893800a0000000002243_0.gt
/scratch_b/verticadb/test/v_test_node0001_data/739/028d04fcbcc5c6f1090c842bf744893800a000000000224b_0.gt
/scratch_b/verticadb/test/v_test_node0001_data/713/02edfdb8e99026fdaef7c4ac657be25a00a0000000002231_0.gt


6. Wait for the old files to get reaped (in about 2 minutes). Now we have only one file on disk, and the database size is 656K.

[skeswani@skeswani] [@:/]$ find /scratch_b/verticadb/test/v_test_node0001_data/ -name *.gt
/scratch_b/verticadb/test/v_test_node0001_data/739/028d04fcbcc5c6f1090c842bf744893800a000000000224b_0.gt
[skeswani@skeswani] [@:/]$ du -sh /scratch_b/verticadb/test/v_test_node0001_data/
656K    /scratch_b/verticadb/test/v_test_node0001_data/


7. Now restore an older snapshot: I stop the database, delete the catalog, and restore the older snapshot.

skeswani=> select shutdown();
          shutdown          
 ----------------------------
 Shutdown: moveout complete
 (1 row)


[skeswani@skeswani] [@:/]$ rm -rf /scratch_b/verticadb/test/v_test_node0001_catalog/*

[skeswani@skeswani] [@:/]$ /opt/vertica/bin/vbr.py --task restore --archive 20170515_162940 --config backup_snapshot.ini
Starting full restore of database test.
Participating nodes: v_test_node0001.
Restoring from restore point: backup_snapshot_20170515_162940
Determining what data to restore from backup.
[==================================================] 100%
Approximate bytes to copy: 244031578 of 244032189 total.
Syncing data from backup to cluster nodes.
[==================================================] 100%
Restoring catalog.
Restore complete!

8. Note that the data on disk is the same size as it was at backup time. So it works as expected, right?
 (Not quite, if you have used most of your disk space.)

[skeswani@skeswani] [@:/]$ du -sh /scratch_b/verticadb/test/v_test_node0001_data/
428K    /scratch_b/verticadb/test/v_test_node0001_data/


9. However, if you look really closely (by running a watch program on the data directory), you can see that we first create the files that belong to the snapshot and only then delete the files that don't. During this time you can have both old and new files in the data volume.
Hence, if you are running close to 100% disk usage, your restore may fail.

[skeswani@skeswani ~] [@:/]$ inotifywait -m -r --format '%e %f' -e delete,create /scratch_b/verticadb/test/v_test_node0001_data/
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
CREATE .02edfdb8e99026fdaef7c4ac657be25a00a0000000002231_0.gt6GLkNs   <== create first
DELETE 028d04fcbcc5c6f1090c842bf744893800a000000000224b_0.gt          <== delete later
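
Given this create-then-delete order, one rough precaution is to check, before restoring, that the data volume has enough free space for the files the restore will copy in. The Python sketch below is only an illustration under stated assumptions: `has_room_for_restore` is a hypothetical helper, `bytes_to_copy` is a figure you would have to estimate yourself (for example from the size of the backup), and the 10% headroom is an arbitrary choice.

```python
import shutil


def has_room_for_restore(data_dir, bytes_to_copy, headroom=0.10):
    """Return True if data_dir's file system has room for bytes_to_copy.

    Because snapshot files are copied in before stale files are deleted,
    the copy has to fit into the space that is free *right now*.
    headroom is an assumed safety margin (10% by default), not a Vertica rule.
    """
    free_bytes = shutil.disk_usage(data_dir).free
    required = bytes_to_copy * (1 + headroom)
    return free_bytes >= required


# Example: would a copy of about 244 MB fit in the current directory's volume?
if has_room_for_restore(".", 244032189):
    print("Enough free space for the copy phase of the restore.")
else:
    print("Not enough free space; free some up before restoring.")
```

If the check fails, space has to be freed before the restore starts, since the restore itself will not delete anything until after the copy.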


Hope this answers your question.

- Sumeet

skeswani · May 2017

Of course, dropping the database before the restore would delete the files first.
We leave that for the user to do manually in special circumstances, such as disks with little or no remaining space.

amosmoss · May 2017

Thank you, skeswani, for the detailed explanation.