Backup completed with errors and a warning like "Cannot grab lock to create snapshot"

Sometimes a Vertica backup script fails on its first attempt with warning and error messages like the ones below, then succeeds on retry:

    Copying...
    3016: vbr client subproc on 10.123.2.4 terminates with returncode 1. Details in vbr_v_aoe1p_node0004_client.log on that host. Error msg: rsync: send_files failed to open "/apps/vertica/AOE1P/v_aoe1p_node0004_catalog/Snapshots/PRDBackup/apps/vertica/AOE1P/v_aoe1p_node0004_data/365/58546795212320365/core": Permission denied (13)
    rsync: send_files failed to open "/apps/vertica/AOE1P/v_aoe1p_node0004_catalog/Snapshots/PRDBackup/apps/vertica/AOE1P/v_aoe1p_node0004_data/381/58546795212320381/core": Permission denied (13)
    rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1042) [sender=3.0.7]
    rsync failed!
    Child processes terminated abnormally.
    backup failed!
    3014: vbr client subproc on 10.123.2.3 terminates with returncode 255. Details in vbr_v_aoe1p_node0003_client.log on that host. Error msg: Killed by signal 2.
    3012: vbr client subproc on 10.123.2.2 terminates with returncode 2. Details in vbr_v_aoe1p_node0002_client.log on that host. Error msg: cancelled by SIGINT
    WARNING: Node: v_aoe1p_node0001: Cannot grab lock to remove snapshot 'PRDBackup'. It might be used by others.
    3018: vbr client subproc on 10.123.2.5 terminates with returncode 2. Details in vbr_v_aoe1p_node0005_client.log on that host. Error msg: cancelled by SIGINT
    WARNING: Node: v_aoe1p_node0001: Cannot grab lock to create snapshot 'PRDBackup'. It might be used by others.
    cleaning up...
    Retrying... #1
    Copying...
    274306082318 out of 274306082318, 100%
    All child processes terminated successfully.
    Committing changes on all backup sites...
    backup done!
    2011-12-09_23:56:30 runbackup.sh ends

Comments

  • There is a known issue in 5.0 where a backup sometimes fails if SnapshotRetentionTime is set very low. In this case it is probably set to 1 day. We suggest that customers increase SnapshotRetentionTime to a value larger than 1 day. For example, to set it to 7 days (7 × 86400 = 604800, assuming the value is given in seconds): vsql=> SELECT SET_CONFIG_PARAMETER ('SnapshotRetentionTime', 604800); In addition, the lock contention error may persist if an old backup (vbr.py) process is still active. Check for such processes and stop them.
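
The check for a leftover vbr.py process mentioned above can be scripted. The following is only a minimal sketch, assuming a Linux host where `ps -eo pid,args` is available; the helper names `find_vbr_processes` and `list_running_vbr` are hypothetical and not part of Vertica. It only lists candidate processes and leaves the decision to stop them to the operator.

```python
import subprocess


def find_vbr_processes(ps_output):
    """Return (pid, command) pairs for ps lines whose command mentions vbr.py.

    Expects text in the shape produced by `ps -eo pid,args`:
    a header row, then one "PID COMMAND..." line per process.
    """
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any line that does not start with a PID.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        if "vbr.py" in command:
            matches.append((pid, command))
    return matches


def list_running_vbr():
    """Run ps on the local host and return any vbr.py processes found."""
    # Assumption: procps-style ps; pid,args prints the PID and full command line.
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_vbr_processes(result.stdout)


if __name__ == "__main__":
    for pid, command in list_running_vbr():
        print(pid, command)
```

Run it on each node before starting a new backup. If it prints a vbr.py process that belongs to an earlier run, confirm that the run is really stale before stopping it.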
