Failed to create S3 backup
Hi,
I want to create a full backup to S3, but I get this error:
2018-01-15 09:32:13 localhost vbr Error unlocking remote file: [Errno 111] Connection refused.
2018-01-15 09:32:13 localhost vbr Traceback (most recent call last):
File "/opt/vertica/bin/vbr", line 8971, in main
if vbr_task.run():
File "/opt/vertica/bin/vbr", line 4325, in run
return self._run()
File "/opt/vertica/bin/vbr", line 5403, in _run
Parallel.foreach(copy_node_objects, self._participating_nodes)
File "/opt/vertica/bin/vbr", line 8656, in foreach
cls.map(func, iterable, threads_num=threads_num)
File "/opt/vertica/bin/vbr", line 8635, in map
if not thr.join(Parallel.WAIT_QUANTUM):
File "/opt/vertica/bin/vbr", line 8611, in join
return self._thr.join(*args, **kwargs)
File "/opt/vertica/bin/vbr", line 8565, in run
super(Parallel.CancellableThread, self).run()
File "/opt/vertica/oss/python/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/opt/vertica/bin/vbr", line 8599, in do_run
self._result = func(val)
File "/opt/vertica/bin/vbr", line 5400, in copy_node_objects
obj.storage_id, obj.obj_type, obj.loc_id, obj.length)
File "/opt/vertica/bin/vbr", line 4040, in put_object
self._copy_obj(storage_id, obj_type, loc_id, length, dest_loc_id)
File "/opt/vertica/bin/vbr", line 3963, in _copy_obj
self._run_batch_if_ready(key, batch, idx)
File "/opt/vertica/bin/vbr", line 3977, in _run_batch_if_ready
self._run_work_thread(key, batch)
File "/opt/vertica/bin/vbr", line 3934, in _run_work_thread
thr.join()
File "/opt/vertica/bin/vbr", line 8565, in run
super(Parallel.CancellableThread, self).run()
File "/opt/vertica/oss/python/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/opt/vertica/bin/vbr", line 4018, in _copy_objs
storage_location, remote_storage_location, obj_info, worker_ix)
File "/opt/vertica/bin/vbr", line 4054, in _remote_copy_objs
storage_location, remote_storage_location, obj_info)
File "/opt/vertica/bin/vbr", line 943, in invoke
logs, result, error_msg = func(*args, **kw)
File "/opt/vertica/oss/python/lib/python2.7/xmlrpclib.py", line 1240, in __call__
return self.__send(self.__name, args)
File "/opt/vertica/oss/python/lib/python2.7/xmlrpclib.py", line 1599, in __request
verbose=self.__verbose
File "/opt/vertica/oss/python/lib/python2.7/xmlrpclib.py", line 1280, in request
return self.single_request(host, handler, request_body, verbose)
File "/opt/vertica/oss/python/lib/python2.7/xmlrpclib.py", line 1310, in single_request
response = h.getresponse(buffering=True)
File "/opt/vertica/oss/python/lib/python2.7/httplib.py", line 1132, in getresponse
response.begin()
File "/opt/vertica/oss/python/lib/python2.7/httplib.py", line 453, in begin
version, status, reason = self._read_status()
File "/opt/vertica/oss/python/lib/python2.7/httplib.py", line 417, in _read_status
raise BadStatusLine(line)
BadStatusLine: ''
How can I fix it?
Answers
Vertica Analytic Database v8.1.1-9
Do you have AllowTcpForwarding set to "no" in your ssh daemon config? If so, that'll need to be changed to "yes".
No, I have AllowTcpForwarding set to "yes".
I think there would be a different error if the AllowTcpForwarding setting was set to "No".
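To verify what the running daemon actually uses (not just what is in the file), you can ask sshd directly. A quick check, assuming a standard sshd install managed by systemd:

```shell
# Print the effective setting of the running sshd (needs root):
sudo sshd -T | grep -i allowtcpforwarding

# If it shows "no", set "AllowTcpForwarding yes" in /etc/ssh/sshd_config
# on every node, then reload the daemon:
sudo systemctl reload sshd
```

Run the check on every node in the cluster, since vbr forwards ports to all participating hosts.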
Make sure you initialized the backup location.
See:
https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/AdministratorsGuide/BackupRestore/ConfiguringBackupHosts.htm
Also, check out the section about the "S3 locking file reset command":
https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/AdministratorsGuide/BackupRestore/CreatingBackupsonAmazonS3.htm
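Initializing means running vbr's init task once against your config file before the first backup. A sketch, where the config file name and path are placeholders for your own:

```shell
# One-time initialization of the backup location described in the config
# file (file name/path are placeholders):
/opt/vertica/bin/vbr --task init --config-file /home/dbadmin/s3_backup.ini
```

If a previous failed run left the S3 locking file behind, the second doc above describes how to reset it.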
Are you using IAM?
A BadStatusLine reported here generally indicates a problem with SSH port forwarding. Vbr makes connections to remote nodes over forwarded ports. The BadStatusLine error results when a connection is made, a request is sent but an invalid response comes back -- usually "invalid" is an empty response as a result of the connection being prematurely closed.
This may occur as a result of SSH configuration issues, or potentially issues with firewall rules or with binding to the right port on the remote host. One place to start diagnosis would be with ssh daemon logs on other cluster nodes.
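The symptom itself is easy to reproduce outside of vbr. This is not Vertica code, just a minimal sketch of the generic mechanism: a peer that accepts a connection, reads the request, and closes without sending an HTTP status line makes the client raise BadStatusLine (reported with an empty line in Python 2; in Python 3 it surfaces as RemoteDisconnected, a subclass of BadStatusLine).

```python
import http.client
import socket
import threading

# A throwaway local listener standing in for the broken forwarded port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

def close_without_reply():
    conn, _ = listener.accept()
    conn.recv(65536)   # read the request...
    conn.close()       # ...then hang up without sending any status line

threading.Thread(target=close_without_reply, daemon=True).start()

client = http.client.HTTPConnection("127.0.0.1", port)
client.request("GET", "/")
try:
    client.getresponse()
    outcome = "got a response"
except http.client.BadStatusLine:
    # RemoteDisconnected subclasses BadStatusLine; an empty reply lands here
    outcome = "BadStatusLine"
print(outcome)  # BadStatusLine
```

So anything that silently drops the forwarded connection mid-transfer (sshd, a firewall, an idle timeout) will present exactly this way in vbr's log.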
I was referring to the online doc:
@sergey_h - Make sure you set AllowTcpForwarding = Yes on each DB node. Also, did you get the error "Connection refused" / "BadStatusLine: ''" immediately, or was it after some time has passed?
@jheffner - Do you think there could be a timeout (network/firewall, etc.) that might result in @sergey_h's "...connection being prematurely closed"?
@Jim_Knicely
On all nodes AllowTcpForwarding = Yes
I get the error after some time, at about 1-4% of the copying.
In my test case, if I create an S3 backup of a small table, everything is OK and the backup completes without errors. With a big table, I get the errors.
@sergey_h If the backups succeed for small tables, I'd suggest looking at your environment. Do you have any bandwidth or network throttling set up?
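One quick, crude way to check for throttling is to push sustained traffic between two nodes over the same kind of SSH channel vbr uses (host name is a placeholder; this is just a sketch, not anything vbr does itself):

```shell
# Push 512 MB of zeros through an SSH connection and watch the rate
# dd reports at the end; a rate that collapses after the first few
# seconds is a hint that something is throttling sustained transfers.
dd if=/dev/zero bs=1M count=512 | ssh other_node 'cat > /dev/null'
```

The failure at 1-4% of copying fits a limit that only kicks in once a transfer has been running for a while.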