Error: failed to open connection to node
We get this error while trying to run a query that joins a few objects in v_catalog and an object in v_monitor.
"[Vertica][VJDBC](4054) ERROR: NetworkSend on <node name>: failed to open connection to node <Node Name>(Transport channel closed)"
Here is the list of objects involved in the query.
v_catalog.tables
v_monitor.projection_storage
v_catalog.projections
v_catalog.tables
The query runs fine if we rerun it after it fails, and we could not find a reason for the failure. Any suggestions would be greatly appreciated.
Comments
Hi Guna,
You said that the error goes away when you retry. When does it come back? Does it only happen some of the time? If so, how frequently? 1%, 10%, 50%?
The error indicates that Vertica's node-to-node communication has failed. Network connections are vulnerable to environmental interference (i.e. network hiccups), and failures are more likely in a virtualized environment, which usually has less reliable networking between the nodes.
Node-to-node communication failures typically cause the query to be retried internally. Vertica will retry a limited number of times before showing the error to the user.
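Since a rerun usually succeeds, one option is to retry on the client side as well when the error does surface. The sketch below is only an illustration and is not how Vertica retries internally: `run_with_retry`, `run_query`, `is_transient` and the backoff values are hypothetical names and numbers, and the only thing taken from this thread is the error text it matches on.

```python
import time


def run_with_retry(run_query, attempts=3, base_delay=1.0,
                   is_transient=None, sleep=time.sleep):
    """Call run_query() and retry on transient errors with exponential backoff.

    run_query is any zero-argument callable that executes the query
    (hypothetical; wrap your own driver call in it).
    """
    if is_transient is None:
        # Assumption: treat only the error from this thread as transient.
        def is_transient(exc):
            return "failed to open connection to node" in str(exc)

    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except Exception as exc:
            # Give up on the last attempt or on errors that are not transient.
            if attempt == attempts or not is_transient(exc):
                raise
            # Wait 1x, 2x, 4x ... base_delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Anything that is not recognized as transient (a syntax error, for example) is raised immediately, so real problems are not hidden behind retries.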
Derrick
Thank you for the information, DerrickR. This is a VM environment and the error occurs rarely (< 1%). I will bring it to our sys admin's attention and ask them to monitor the network connectivity between the nodes.
Guna,
Glad to be of help. Keep in mind the scaling function in this case...
Vertica makes a connection between every pair of nodes. It must, because data from any node may need to go to any other node for execution.
If you have N nodes, then you have on the order of N-squared total connections. As N increases, the number of connections grows rather quickly.
If you are observing a 1% query failure rate, that indicates a per-connection failure rate of something like 0.01/(N^2) on your network. So your sys admin may claim that things are very healthy, but even a very low failure rate on individual connections can cause the errors you are seeing.
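To put rough numbers on that, here is a small back-of-the-envelope sketch. It rests on simplifying assumptions that are mine, not measurements: every query touches every connection, connections fail independently, and internal retries are ignored. It counts N*(N-1) directed connections, which is close to the N-squared figure for larger clusters. The function names are made up for illustration.

```python
def connection_count(nodes: int) -> int:
    """Directed node-to-node connections, excluding a node to itself."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes")
    return nodes * (nodes - 1)


def per_connection_failure_rate(query_failure_rate: float, nodes: int) -> float:
    """Solve 1 - (1 - p)**c = query_failure_rate for p.

    Assumes independent failures and that a query uses all c connections.
    """
    c = connection_count(nodes)
    return 1.0 - (1.0 - query_failure_rate) ** (1.0 / c)


if __name__ == "__main__":
    # Per-connection failure rate implied by a 1% query failure rate.
    for n in (3, 10, 50):
        p = per_connection_failure_rate(0.01, n)
        print(f"{n} nodes, {connection_count(n)} connections: p ~ {p:.2e}")
```

For small rates the result is very close to the simple division 0.01 / (number of connections), so the larger the cluster, the smaller the per-connection failure rate needed to produce the same 1% of failed queries.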
A stable and reliable node-to-node network is an assumption of Vertica's current architecture. Vertica is specifically not well suited to a Wide-Area-Network or Globally Distributed cluster layout.