How to set up IP connectivity for clients to access Vertica cluster nodes
We have a 4-node Vertica cluster. To set up client access to the Vertica database in the cluster, which node's IP address should be used? If only one node's IP can be used, what happens if that node is down? Is it mandatory to deploy load balancer VIP configurations? The Vertica 5.0 documentation has some instructions on using load balancers, but they are missing from the Vertica 6.x documentation. Does that mean load balancers are not required in Vertica 6.x?
Comments
>> If only 1 node's IP can be used what happens if that node is down ?
adminTools will take care of it: it uses a sort of round-robin algorithm for connectivity, i.e. if a node is down, adminTools will connect you to another node that is up.
>> Is it mandatory to deploy Load Balancer VIP configurations ?
No.
FYI: a load balancer is itself a single point of failure, so you need to take care of that too.
>> does that mean Load Balancers are not required in Vertica 6.x
No, but by the way, the version 6 docs do have documentation about load balancing: https://my.vertica.com/docs/6.1.x/HTML/index.htm#15242.htm
By your logic, you need it (since it is in the documentation).
Often, people need to connect via vsql or JDBC/ODBC/etc, rather than through adminTools. Vertica 6.x doesn't automatically round-robin those tools.
That said, we know that people would like load-balancing technology. Please do read the documentation for new versions of Vertica as they come out, including Vertica 7.0 (which we are in the process of publishing as we speak).
Vertica allows you to connect to the IP address of any node. You mention only being able to connect to one node? Is this because of a firewall that you have configured in front of the Vertica cluster? If you have some sort of firewall that restricts access to one IP address, that address will typically be a single point of failure -- it's a single address; if it stops working for any reason (the Vertica node, the firewall, etc), you lose access to your cluster.
Some load balancing solutions have instant-failover functionality that avoids having a single IP become a single point of failure. Our bundled load-balancing solution doesn't have that feature out of the box; proper instant failover is complex, and making a few IP addresses public instead of one is typically something that people are comfortable with doing in the name of high availability. However, if you get in contact with any of our sales reps, I'm confident that they'd be glad to sell you other load-balancing products that can do this :-)
Adam
There are no firewalls; however, I am not sure how to configure the client machines to use multiple IP addresses.
For example, can I provide multiple IP addresses for the Servername parameter in the ODBC/JDBC client configurations? If not, how should one configure the client libraries when using ODBC/JDBC?
Thanks,
Guhen
Programmatically only. Provide the IPs of all cluster nodes to the client app.
If you have access to one node (even if it is down), you can read the adminTools or spread configuration file to get the other IPs of the cluster.
You can get the list of IPs by running the query:
SELECT * FROM nodes;
In current versions of Vertica, Daniel is correct that you'd have to implement this sort of load balancing yourself. The simplest, naive way is to just expose the server as a config option in your application, and manually switch IPs if a node fails. A less painful solution is to query the above table up front and store the list of server IPs; then just have a retry loop that tries to connect to each IP until it finds one that works.
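The retry loop described above could look roughly like the following. This is only a sketch under assumptions: `connect_with_failover` and `tcp_connect` are hypothetical names, the connector just opens a plain TCP socket as a stand-in for a real driver call, and port 5433 is assumed to be the client port of your cluster.

```python
import socket


def connect_with_failover(hosts, connect):
    """Try each host in order and return (host, connection) for the first success.

    `connect` is any callable that takes a host and either returns a
    connection object or raises an exception.
    """
    errors = {}
    for host in hosts:
        try:
            return host, connect(host)
        except Exception as exc:  # a real client would catch its driver's error types
            errors[host] = exc
    raise ConnectionError(f"no node reachable; tried: {errors}")


def tcp_connect(host, port=5433, timeout=3.0):
    """Stand-in connector (assumption): opens a plain TCP socket to the node.

    Replace this with the connect call of whatever driver you actually use.
    """
    return socket.create_connection((host, port), timeout=timeout)
```

In use, the host list would be the addresses returned by the `nodes` query, cached by the application, for example `connect_with_failover(cached_ips, tcp_connect)`. Passing the connector in as a parameter keeps the loop independent of any particular client library.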
Again, please do read Vertica 7's documentation on this functionality as soon as it's available, as it works somewhat differently than in previous versions.
2) Also a proxy, but with another strategy: the same IP but different ports, one port per node, with SSH tunneling or port forwarding to redirect each connection.
- So the other nodes' IPs are not exposed.
- Only one IP is used for connections.
But it's still a single point of failure.
Thank you
Tonight we tried to set up DNS CNAME round robin to avoid being bound to IPs. RFC 2181 does not recommend CNAME round robin, so we're only halfway to a bulletproof architecture. If HP Vertica could provide DNS SRV handling for clients, it would get us the other half of the way.
Thanks for the info. But after some reading, I understand that round-robin DNS provides rudimentary load balancing but is not failure-aware. Can you share how you resolve that?
Thanks.
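One possible way to make up for round-robin DNS not being failure-aware is to handle failover on the client: resolve the name to every address it returns and try them in turn. The sketch below is an illustration only, not what the poster above did. `resolve_all` and `first_reachable` are hypothetical helpers, the TCP socket stands in for a real driver connection, and port 5433 is an assumption.

```python
import socket


def resolve_all(name, port, resolver=socket.getaddrinfo):
    """Return every distinct address the name resolves to, in resolver order."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
        name, port, type=socket.SOCK_STREAM
    ):
        addr = sockaddr[0]
        if addr not in seen:
            seen.append(addr)
    return seen


def first_reachable(name, port=5433, timeout=3.0, resolver=socket.getaddrinfo):
    """Open a TCP connection to the first resolved address that accepts one."""
    last_error = None
    for addr in resolve_all(name, port, resolver):
        try:
            return addr, socket.create_connection((addr, port), timeout=timeout)
        except OSError as exc:
            last_error = exc  # remember the failure and move on to the next address
    raise ConnectionError(
        f"no address for {name} accepted a connection"
    ) from last_error
```

The resolver is a parameter so the lookup can be swapped out or stubbed. Note that this only skips addresses that refuse or time out at connect time; it does not detect a node that accepts connections but is otherwise unhealthy.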