Not able to remove the node from DB
Hello!!
I am running a 3-node cluster, with all the nodes on AWS EC2. One of the EC2 machines has a problem and will not boot at all.
I am trying to remove this node from the DB and its host from the cluster, and then add a new host and node.
Since the node is in the DOWN state, I am not able to remove it.
Node | Host | State | Version | DB
----------------------+-------------+-------+------------------+---------------------------
v_utilytics_node0001 | 10.69.2.249 | UP | vertica-8.1.1.17 | utilytics
v_utilytics_node0002 | 10.69.2.179 | DOWN | unavailable | utilytics
v_utilytics_node0003 | 10.69.2.245 | UP | vertica-8.1.1.17 | utilytics
I am trying to remove the host 10.69.2.179 from the database:
[dbadmin@ip-10-69-2-249 ~]$ admintools -t db_remove_node -s 10.69.2.179 -d utilytics -p '*************'
connecting to 10.69.2.249
Before removing nodes from a database, all dependencies on the node must be removed.
This usually requires that schemas be redesigned so that the node to be removed is
no longer in use.
Do you wish to continue (y/N) [N]
y
y
Error removing node(s) from database.
['All nodes must be UP or STANDBY before dropping a node'].
How can I remove this node?
I am using Vertica Analytic Database v8.1.1-17.
Comments
Please open a support case, as we might need to edit the catalog to remove the down node.
Here is the procedure (assuming a 3-node environment):
Make sure the other 2 nodes are UP (if not, bring them up with start_db --force, or else from the latest stable epoch).
Change the node IP in vsql: "ALTER NODE HOSTNAME '';"
Change the node's spread/control IP: "ALTER NODE CONTROL HOSTNAME '';"
Rewrite spread.conf with the new IP address and reload the running config (the DB should remain UP): "SELECT RELOAD_SPREAD(true);"
On the UP node, edit /opt/vertica/config/admintools.conf and replace the faulty EC2 server's IP with the new IP.
Open admintools (/opt/vertica/bin/admintools) and select "Restart Vertica on Host".
Once the above steps are complete, the new node starts initializing and recovering, and its state then changes to UP.
Recovery time depends on your DB size; you can check the recovery status by running the query below in vsql:
SELECT * FROM TABLE_RECOVERY_STATUS;
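The admintools.conf edit in the procedure above could be scripted roughly as follows. This is only a sketch: replace_host_ip is a hypothetical helper (not part of Vertica), and it assumes the old IP appears in the file as plain text. It keeps a .bak copy so you can diff or revert the change.

```shell
#!/bin/sh
# Sketch only: swap the failed host's IP for the replacement host's IP
# in an admintools.conf-style file, keeping a backup first.
# replace_host_ip is a hypothetical helper, not a Vertica tool.
replace_host_ip() {
    conf_file=$1   # e.g. /opt/vertica/config/admintools.conf
    old_ip=$2      # IP of the failed host
    new_ip=$3      # IP of the replacement host

    [ -f "$conf_file" ] || { echo "no such file: $conf_file" >&2; return 1; }

    # Keep a copy so the change can be reviewed or reverted.
    cp -p "$conf_file" "$conf_file.bak" || return 1

    # Escape the dots so they match literally instead of "any character".
    old_re=$(printf '%s' "$old_ip" | sed 's/\./\\./g')

    # Plain substring replacement: an old IP that is a prefix of another
    # address (10.69.2.17 vs 10.69.2.179) would also match, so check the
    # result against the .bak copy before restarting anything.
    sed "s/$old_re/$new_ip/g" "$conf_file.bak" > "$conf_file"
}
```

For the cluster in this thread the call would look like `replace_host_ip /opt/vertica/config/admintools.conf 10.69.2.179 NEW_IP`, where NEW_IP stands for the replacement instance's address. Comparing the file with its .bak copy afterwards shows exactly which lines changed.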
The queries in the comment above are missing the node_name and new_ip_address. Here they are in full:
-- change the node IP
ALTER NODE node_name HOSTNAME 'new_ip_address';
-- change the node's spread/control IP
ALTER NODE node_name CONTROL HOSTNAME 'new_ip_address';
-- re-write spread.conf with the new IP address and reload the running config
-- (db should remain UP)
SELECT RELOAD_SPREAD(true);
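To avoid typos when filling in the placeholders, the statements above can be generated from the node name and new address and reviewed before they are run. A minimal sketch, assuming a POSIX shell; print_readdress_sql is a hypothetical helper name, and it does nothing except print text.

```shell
#!/bin/sh
# Sketch only: print the re-addressing statements from this thread with
# the node name and new IP filled in. Nothing is executed against the DB.
# print_readdress_sql is a hypothetical helper, not a Vertica tool.
print_readdress_sql() {
    node_name=$1   # e.g. v_utilytics_node0002
    new_ip=$2      # IP of the replacement host

    if [ -z "$node_name" ] || [ -z "$new_ip" ]; then
        echo "usage: print_readdress_sql NODE_NAME NEW_IP" >&2
        return 1
    fi

    cat <<EOF
ALTER NODE $node_name HOSTNAME '$new_ip';
ALTER NODE $node_name CONTROL HOSTNAME '$new_ip';
SELECT RELOAD_SPREAD(true);
EOF
}
```

For the down node in this thread that would be `print_readdress_sql v_utilytics_node0002 NEW_IP`, with NEW_IP standing for the replacement instance's address. Once the output looks right, it can be pasted into a vsql session on one of the UP nodes.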