Keepalive settings for AWS Network Load Balancer
Serge Bonte authored this post.
Network load balancers are one of the three types of load balancers supported by Amazon’s Elastic Load Balancing. For detailed information, see What is a Network Load Balancer?
Since load balancers act as a proxy between clients (such as JDBC) and Vertica servers, it is important to understand how an AWS NLB handles idle timeouts for connections.
The idle timeout value is set at 350 seconds and cannot be changed. The timeout applies to both connection points: the client side and the server side. For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to a JDBC client hanging while it waits for results that will never be returned, because the server failed to send a keepalive within 350 seconds.
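The rule above can be illustrated with a minimal sketch. It assumes a connection is at risk whenever two consecutive packets (data or keepalive) are 350 seconds or more apart. The function name and the timestamps are hypothetical, chosen only for illustration.

```python
NLB_IDLE_TIMEOUT_S = 350  # idle timeout stated in this post


def first_idle_gap(packet_times_s, timeout_s=NLB_IDLE_TIMEOUT_S):
    """Return (start, end) of the first gap of at least timeout_s, or None.

    packet_times_s holds the times, in seconds, at which one side of the
    connection sent any packet, keepalives included.
    """
    times = sorted(packet_times_s)
    for earlier, later in zip(times, times[1:]):
        if later - earlier >= timeout_s:
            return (earlier, later)
    return None
```

For a 355-second query, `first_idle_gap([0, 355])` reports a gap because nothing was sent in between, while `first_idle_gap([0, 300, 355])` returns `None` because a keepalive at 300 seconds keeps every gap under the limit.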
How to identify that type of issue?
A good way to identify an idle timeout/keepalive issue is to run a query that stays silent for slightly longer than the 350-second idle timeout, via a client such as JDBC:
=> SELECT SLEEP(355);
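To tell a terminated connection from a hang, it helps to time the call and give up after a fixed wait. The sketch below is one way to do that and is not tied to any particular driver: `run_query` is a hypothetical placeholder for whatever client call sends the query through the load balancer, and `probe` is an illustrative name.

```python
import threading
import time


def probe(run_query, wait_s):
    """Run run_query() in a daemon thread and report how it ended.

    Returns (outcome, elapsed_seconds), where outcome is "result",
    "error", or "hang" (no answer within wait_s seconds).
    """
    state = {}

    def worker():
        try:
            state["value"] = run_query()
            state["outcome"] = "result"
        except Exception as exc:  # e.g. the driver reports a dropped connection
            state["error"] = exc
            state["outcome"] = "error"

    start = time.monotonic()
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout=wait_s)  # stop waiting after wait_s seconds
    elapsed = time.monotonic() - start
    if thread.is_alive():
        return "hang", elapsed
    return state["outcome"], elapsed
```

With the query above, a wait of around 400 seconds leaves enough margin: an `"error"` outcome before 355 seconds points at the client side, and a `"hang"` points at the server side.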
How to diagnose and correct that type of issue?
If the client connection is terminated before 355 seconds, the JDBC keepalive setting has to be lowered so that keepalives are sent less than 350 seconds apart.
If the client connection still has not returned a result after 355 seconds, the server's TCP keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) have to be adjusted so that keepalives are sent less than 350 seconds apart.
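As a rough check on the server side, the two settings can be read and compared against the 350-second limit. This sketch assumes a Linux server, where the kernel exposes them under `/proc/sys/net/ipv4`. The helper names are hypothetical.

```python
from pathlib import Path

NLB_IDLE_TIMEOUT_S = 350  # idle timeout stated in this post
PROC_DIR = Path("/proc/sys/net/ipv4")  # Linux kernel TCP settings (assumed platform)
KEEPALIVE_SETTINGS = ("tcp_keepalive_time", "tcp_keepalive_intvl")


def read_settings(base=PROC_DIR, names=KEEPALIVE_SETTINGS):
    """Read each keepalive setting, in seconds, from the kernel's proc files."""
    return {name: int((base / name).read_text().strip()) for name in names}


def settings_over_limit(settings, limit_s=NLB_IDLE_TIMEOUT_S):
    """Return the names of settings whose value is not below limit_s."""
    return [name for name, value in settings.items() if value >= limit_s]
```

Calling `settings_over_limit(read_settings())` on the Vertica node lists the settings that still need lowering. With the usual Linux defaults of 7200 and 75 seconds, only `tcp_keepalive_time` is flagged.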
Note: Adjusting keepalive/idle timeout settings for AWS “Classic” Load Balancers was covered in a previous blog post: AWS Elastic Load Balancing with Vertica