backup to S3 cleanup sometimes fails
Hello,
We are running Vertica 9.3.1-8 on Amazon Linux AWS EC2 hosts. The S3 backup script (vbr.py) occasionally fails with the stack trace below. Is there any way to make the script more resilient? I suspect AWS is pushing back on the rate of S3 API calls being sent, but I have not dug deeply into the code.
2021-05-23 09:01:47 localhost vbr Old restore points to be removed: 20210501_200009
2021-05-23 09:29:45 localhost vbr Approximate bytes to copy: ... of .... total.
2021-05-23 09:45:42 localhost vbr Copying backup metadata.
2021-05-23 09:46:01 localhost vbr Finalizing backup.
2021-05-23 09:46:01 localhost vbr Location s3://.....: putting updated backup manifest.
2021-05-23 09:46:12 localhost vbr Deleting old restore points.
2021-05-23 09:47:22 localhost vsql /opt/vertica/bin/vsql -q -t -X -ddb -p5433 -Udbadmin -h....
2021-05-23 09:47:24 localhost vbr Traceback (most recent call last):
File "/opt/vertica/bin/vbr.py", line 10323, in main
if vbr_task.run():
File "/opt/vertica/bin/vbr.py", line 5150, in run
result = self._run()
File "/opt/vertica/bin/vbr.py", line 6559, in _run
Parallel.foreach(delete_backup_objects, self._distinct_backup_locations)
File "/opt/vertica/bin/vbr.py", line 10003, in foreach
cls.map(func, iterable, threads_num=threads_num)
File "/opt/vertica/bin/vbr.py", line 9997, in map
raise exc_info[0](err_msg).with_traceback(exc_info[2])
File "/opt/vertica/bin/vbr.py", line 9981, in map
if not thr.join(Parallel.WAIT_QUANTUM):
File "/opt/vertica/bin/vbr.py", line 9957, in join
return self._thr.join(*args, **kwargs)
File "/opt/vertica/bin/vbr.py", line 9932, in join
raise self._exc_info[0](err_msg).with_traceback(self._exc_info[2])
File "/opt/vertica/bin/vbr.py", line 9910, in run
super(Parallel.CancellableThread, self).run()
File "/opt/vertica/oss/python3/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/vertica/bin/vbr.py", line 9945, in do_run
self._result = func(val)
File "/opt/vertica/bin/vbr.py", line 6543, in delete_backup_objects
self._is_data_from_shared_storage())
File "/opt/vertica/bin/vbr.py", line 5664, in _delete_backup_objects
delete_fanout_dir_objects, objs_by_fanout_dir, threads_num=delete_concurrency)
File "/opt/vertica/bin/vbr.py", line 10003, in foreach
cls.map(func, iterable, threads_num=threads_num)
File "/opt/vertica/bin/vbr.py", line 9997, in map
raise exc_info[0](err_msg).with_traceback(exc_info[2])
File "/opt/vertica/bin/vbr.py", line 9981, in map
if not thr.join(Parallel.WAIT_QUANTUM):
File "/opt/vertica/bin/vbr.py", line 9957, in join
return self._thr.join(*args, **kwargs)
File "/opt/vertica/bin/vbr.py", line 9932, in join
raise self._exc_info[0](err_msg).with_traceback(self._exc_info[2])
File "/opt/vertica/bin/vbr.py", line 9910, in run
super(Parallel.CancellableThread, self).run()
File "/opt/vertica/oss/python3/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/vertica/bin/vbr.py", line 9945, in do_run
self._result = func(val)
File "/opt/vertica/bin/vbr.py", line 5661, in delete_fanout_dir_objects
backup_access.delete_many('', obj_paths)
File "/opt/vertica/bin/vbr.py", line 3536, in delete_many
self._delete_many(remote_prefix, file_list)
File "/opt/vertica/bin/vbr.py", line 3837, in _delete_many
Parallel.map(delete_file_list, file_lists, threads_num=self.DELETE_THREADS_NUM)
File "/opt/vertica/bin/vbr.py", line 9997, in map
raise exc_info[0](err_msg).with_traceback(exc_info[2])
File "/opt/vertica/bin/vbr.py", line 9981, in map
if not thr.join(Parallel.WAIT_QUANTUM):
File "/opt/vertica/bin/vbr.py", line 9957, in join
return self._thr.join(*args, **kwargs)
File "/opt/vertica/bin/vbr.py", line 9932, in join
raise self._exc_info[0](err_msg).with_traceback(self._exc_info[2])
TypeError: __init__() missing 1 required positional argument: 'operation_name'
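For context on the final TypeError: the re-raise pattern in the traceback (`raise exc_info[0](err_msg)`) rebuilds the original exception from its class with only a message string. That works for plain Exception subclasses, but fails for classes whose constructor requires extra positional arguments — botocore's ClientError, for instance, requires an `operation_name`. A hypothetical minimal reproduction (not vbr.py code; the class and names are illustrative):

```python
import sys

class OperationError(Exception):
    # Stands in for an exception class (like botocore's ClientError)
    # whose constructor requires more than a single message string.
    def __init__(self, message, operation_name):
        super().__init__(message)
        self.operation_name = operation_name

def delete_worker():
    # The original failure inside one of the delete threads.
    raise OperationError("SlowDown", operation_name="DeleteObjects")

try:
    delete_worker()
except Exception:
    exc_info = sys.exc_info()  # (class, instance, traceback), saved by the thread wrapper

# The re-raise pattern from the traceback: construct a new instance of the
# original exception class with only an error-message string. Fine for plain
# Exceptions, but a TypeError for classes with required extra arguments.
try:
    raise exc_info[0]("error in delete thread").with_traceback(exc_info[2])
except TypeError as err:
    reraise_failure = err

print(reraise_failure)  # ... missing 1 required positional argument: 'operation_name'
```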
Thank you.

Answers
Sounds like something we fixed in VER-72948, available in 9.3.1-12 and above. See: https://www.vertica.com/docs/ReleaseNotes/9.3.x/Vertica_9.3.x_Release_Notes.htm#9.3.1-12
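Until the upgrade, if S3 throttling really is contributing, the usual client-side mitigation is jittered exponential backoff around the throttled calls. A generic Python sketch, not a vbr.py hook (the function name and parameters are illustrative):

```python
import random
import time

def with_backoff(call, max_attempts=8, base_delay=0.5, max_delay=30.0,
                 retryable=(Exception,)):
    """Invoke call(), retrying retryable errors with jittered exponential
    backoff -- the standard way to absorb S3 SlowDown/503 throttling."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(delay * random.uniform(0.5, 1.0))  # jittered sleep
```

When you control the S3 client directly, boto3 has this built in via `botocore.config.Config(retries={"mode": "standard"})`, which is the simpler route than hand-rolling the loop.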
Very nice, thanks Lenoy. We will have to upgrade then.