backup to S3 cleanup sometimes fails

Hello,
we are running Vertica 9.3.1-8 on Amazon Linux AWS EC2 hosts. The backup-to-S3 vbr.py script occasionally fails with the following stack trace. Any idea whether the script can be made more resilient? I suspect AWS is pushing back on the rate of S3 API calls being sent, but I have not dug into the code.

2021-05-23 09:01:47 localhost vbr Old restore points to be removed: 20210501_200009
2021-05-23 09:29:45 localhost vbr Approximate bytes to copy: ... of .... total.
2021-05-23 09:45:42 localhost vbr Copying backup metadata.
2021-05-23 09:46:01 localhost vbr Finalizing backup.
2021-05-23 09:46:01 localhost vbr Location s3://.....: putting updated backup manifest.
2021-05-23 09:46:12 localhost vbr Deleting old restore points.
2021-05-23 09:47:22 localhost vsql /opt/vertica/bin/vsql -q -t -X -ddb -p5433 -Udbadmin -h....
2021-05-23 09:47:24 localhost vbr Traceback (most recent call last):
    File "/opt/vertica/bin/vbr.py", line 10323, in main
      if vbr_task.run():
    File "/opt/vertica/bin/vbr.py", line 5150, in run
      result = self._run()
    File "/opt/vertica/bin/vbr.py", line 6559, in _run
      Parallel.foreach(delete_backup_objects, self._distinct_backup_locations)
    File "/opt/vertica/bin/vbr.py", line 10003, in foreach
      cls.map(func, iterable, threads_num=threads_num)
    File "/opt/vertica/bin/vbr.py", line 9997, in map
      raise exc_info[0](err_msg).with_traceback(exc_info[2])
    File "/opt/vertica/bin/vbr.py", line 9981, in map
      if not thr.join(Parallel.WAIT_QUANTUM):
    File "/opt/vertica/bin/vbr.py", line 9957, in join
      return self._thr.join(*args, **kwargs)
    File "/opt/vertica/bin/vbr.py", line 9932, in join
      raise self._exc_info[0](err_msg).with_traceback(self._exc_info[2])
    File "/opt/vertica/bin/vbr.py", line 9910, in run
      super(Parallel.CancellableThread, self).run()
    File "/opt/vertica/oss/python3/lib/python3.7/threading.py", line 870, in run
      self._target(*self._args, **self._kwargs)
    File "/opt/vertica/bin/vbr.py", line 9945, in do_run
      self._result = func(val)
    File "/opt/vertica/bin/vbr.py", line 6543, in delete_backup_objects
      self._is_data_from_shared_storage())
    File "/opt/vertica/bin/vbr.py", line 5664, in _delete_backup_objects
      delete_fanout_dir_objects, objs_by_fanout_dir, threads_num=delete_concurrency)
    File "/opt/vertica/bin/vbr.py", line 10003, in foreach
      cls.map(func, iterable, threads_num=threads_num)
    File "/opt/vertica/bin/vbr.py", line 9997, in map
      raise exc_info[0](err_msg).with_traceback(exc_info[2])
    File "/opt/vertica/bin/vbr.py", line 9981, in map
      if not thr.join(Parallel.WAIT_QUANTUM):
    File "/opt/vertica/bin/vbr.py", line 9957, in join
      return self._thr.join(*args, **kwargs)
    File "/opt/vertica/bin/vbr.py", line 9932, in join
      raise self._exc_info[0](err_msg).with_traceback(self._exc_info[2])
    File "/opt/vertica/bin/vbr.py", line 9910, in run
      super(Parallel.CancellableThread, self).run()
    File "/opt/vertica/oss/python3/lib/python3.7/threading.py", line 870, in run
      self._target(*self._args, **self._kwargs)
    File "/opt/vertica/bin/vbr.py", line 9945, in do_run
      self._result = func(val)
    File "/opt/vertica/bin/vbr.py", line 5661, in delete_fanout_dir_objects
      backup_access.delete_many('', obj_paths)
    File "/opt/vertica/bin/vbr.py", line 3536, in delete_many
      self._delete_many(remote_prefix, file_list)
    File "/opt/vertica/bin/vbr.py", line 3837, in _delete_many
      Parallel.map(delete_file_list, file_lists, threads_num=self.DELETE_THREADS_NUM)
    File "/opt/vertica/bin/vbr.py", line 9997, in map
      raise exc_info[0](err_msg).with_traceback(exc_info[2])
    File "/opt/vertica/bin/vbr.py", line 9981, in map
      if not thr.join(Parallel.WAIT_QUANTUM):
    File "/opt/vertica/bin/vbr.py", line 9957, in join
      return self._thr.join(*args, **kwargs)
    File "/opt/vertica/bin/vbr.py", line 9932, in join
      raise self._exc_info[0](err_msg).with_traceback(self._exc_info[2])
  TypeError: __init__() missing 1 required positional argument: 'operation_name'
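
For what it's worth, the final TypeError looks like a side effect of the re-raise lines in the traceback (`raise exc_info[0](err_msg).with_traceback(exc_info[2])`) rather than the real failure. If the original exception class needs more than one constructor argument, rebuilding it from a single message string fails and hides the real error. I believe botocore's `ClientError` takes `error_response` and `operation_name`, which would match the message, but that is an assumption on my part. Below is a minimal sketch that reproduces the pattern; `FakeClientError`, the `SlowDown` payload, and the helper functions are made up for illustration and are not taken from vbr.py.

```python
import sys


class FakeClientError(Exception):
    """Hypothetical stand-in for an exception whose constructor needs two arguments."""

    def __init__(self, error_response, operation_name):
        super().__init__(f"{operation_name} failed: {error_response}")
        self.error_response = error_response
        self.operation_name = operation_name


def reraise_like_traceback(exc_info, err_msg):
    # Same shape as the line shown in the traceback: build a NEW exception
    # of the original type from a single message string.
    raise exc_info[0](err_msg).with_traceback(exc_info[2])


def reraise_preserving_original(exc_info):
    # Alternative: re-raise the instance that was actually caught.
    raise exc_info[1].with_traceback(exc_info[2])


def capture():
    # Simulate a worker thread storing sys.exc_info() for later re-raising.
    try:
        raise FakeClientError({"Code": "SlowDown"}, "DeleteObjects")  # made-up values
    except FakeClientError:
        return sys.exc_info()


def demo():
    exc_info = capture()
    results = []
    try:
        reraise_like_traceback(exc_info, "delete failed")
    except TypeError as exc:
        # The real error is replaced by a constructor TypeError.
        results.append(("masked", type(exc).__name__, str(exc)))
    try:
        reraise_preserving_original(exc_info)
    except FakeClientError as exc:
        # The real error, including the operation name, survives.
        results.append(("preserved", type(exc).__name__, exc.operation_name))
    return results


if __name__ == "__main__":
    for row in demo():
        print(row)
```

If this reading is right, the underlying S3 error (possibly throttling) is what actually needs to be found, and the TypeError is only hiding it.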

Thank you.
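
P.S. To make the "more resilient" idea concrete, this is roughly the behaviour I have in mind: a generic retry wrapper with exponential backoff and jitter around the delete call. It is only a sketch under my throttling assumption; `call_with_backoff` and its parameters are hypothetical names and not part of vbr.py.

```python
import random
import time


def call_with_backoff(func, *args, retries=5, base_delay=0.5, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func, retrying with exponential backoff and full jitter on failure."""
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error unchanged
            # Upper bound doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random time in [0, cap] so parallel
            # threads do not all retry at the same moment.
            sleep(random.uniform(0, cap))
```

In practice `retry_on` would be narrowed to whatever exception the S3 client raises for throttling, so that permanent errors such as missing permissions still fail immediately.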
