Removing unresponsive servers from the cluster
We have had unresponsive servers freeze the cluster. In one case, the server responded to ping but not to ssh. In another case, the server responded to ping and to ssh connection attempts, but we could not actually log in through ssh. In both cases the cluster became unresponsive.

In the second case, the cluster started responding again once we sent a shutdown to the server. Since the server was brought back up, recovery has been slow (still going after 3+ days, with under a terabyte of data).

In the first case, I can't shut the server down (no ssh), so the cluster is still unresponsive. I can start vsql but cannot run commands from it (select now(); hangs). I can run admintool from another server in the cluster, but every command hangs. In particular, the shutdown database command has hung.

a) If a server responds to ping but cannot be shut down, how can I restore the cluster to functionality?

b) If a server is taking an inordinate amount of time to recover, is there some way to just restart with a fresh server? admintool won't allow me to remove a server while some nodes are down.