EON Mode - Adding Nodes Error (Directory already exists)
I'm testing Eon Mode on version 11.0.0-3.
I'm adding and removing the same two AWS nodes to/from the primary subcluster. Basically, I'm going from a single node up to three nodes and back down to one (K-safety 0). When nodes 002 and 003 are not in the database, I stop them in AWS to save money. Sometimes I get this error when I try to add nodes 002 and 003 back in:
Failed to prepare a host for database participation. See /opt/vertica/log/adminTools.log for details. Host hints follow. 10.20.xx.xx directory already exists: [Details - <ATResult> status=Error host=10.20.xx.xx error_type=<class 'vertica.engine.api.errors.ATDBPrepareError'> error_message=Catalog parent directory already exists: (/vertica/data/demo1_04)]
I want to force-recreate the directory if possible so I don't have to log in to each of the two new nodes and delete the directory by hand. This is for scripting purposes for the auto-scaling work I'm doing.
I'm using this admintools command:
adminTools -t db_add_node -s 10.20.xx.xx,10.20.xx.xx -d demo1_04 -p 'pass'
When I look at the options I don't see a way to force-recreate the directory. Here are the options I see:
[dbadmin@ad2-demo1-04-node0001 root]$ admintools -t db_add_node -help
Usage: db_add_node [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of the database
  -s HOSTS, --hosts=HOSTS
                        Comma separated list of hosts to add to database
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  -a AHOSTS, --add=AHOSTS
                        Comma separated list of hosts to add to database
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster for the new node
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never' will wait forever; implicitly sets -i)
  -i, --noprompts       do not stop and wait for user input (default false).
                        Setting this implies a timeout of 20 min.
  --compat21            (deprecated) Use Vertica 2.1 method using node names
                        instead of hostnames
Am I missing something? Do I need to fully remove the two nodes from the cluster so Vertica drops the directory before I try to add them to the database again? I want to be able to add and remove the same nodes from the database as smoothly as possible. This is the only error I'm running into right now.
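In case it helps anyone scripting this: one workaround is to clear the leftover catalog parent directory on each host just before calling db_add_node. This is only a sketch. The directory path and database name are taken from the error message above, the host list and password are placeholders, and `clean_catalog_dir` / `readd_nodes` are hypothetical helper names, not admintools features.

```shell
#!/bin/bash
# Sketch only: path, db name, and flags mirror the question above;
# hosts and password are placeholders.
set -euo pipefail

CATALOG_DIR="/vertica/data/demo1_04"   # parent dir named in the ATDBPrepareError
DB="demo1_04"

# Remove a stale catalog parent directory if one was left behind.
clean_catalog_dir() {
    local dir="$1"
    if [ -d "$dir" ]; then
        rm -rf "$dir"
    fi
}

# Clean each target host over ssh, then re-add the nodes in one call.
# Not executed here; invoke as: readd_nodes "10.20.xx.xx,10.20.xx.xx"
readd_nodes() {
    local hosts="$1"
    local host
    for host in ${hosts//,/ }; do
        ssh "dbadmin@$host" "rm -rf '$CATALOG_DIR'"
    done
    admintools -t db_add_node -s "$hosts" -d "$DB" -p 'pass'
}
```

The ssh step assumes passwordless dbadmin access between nodes, which admintools itself already requires.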
Answers
I think I may have found my problem: the drives on node002 and node003 were not mounted when I started them up on AWS, so that may have messed things up.
Vertica does delete the catalog and data directories when you remove a node from the database. I just tested it in Eon Mode and it works as expected. Did you get a snippet similar to the one below when you removed the node? We don't remove the depot directory.
Waiting for rebalance shards for subcluster {'default_subcluster'}. We will wait for at most 36000 seconds.
Spread Remove Nodes
Attempting to drop node v_eondbtestdc1_node0003 ( 10.50.xxx.xxx )
Deleting catalog and data directories
Eon mode detected. The node v_eondbtestdc1_node0003 has been removed from host 10.50.xxx.xxx. To remove the node metadata completely, the database directories will need to be removed manually, please clean up the files corresponding to this node, at the communal location: s3://xxxx/testdcupload/metadata/eondbtestdc1/nodes/v_eondbtestdc1_node0003
Reload spread configuration
Replicating configuration to all nodes
Checking database state
Node Status:
Syncing catalog on eondbtestdc1 with 2000 attempts.
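As the removal message says, the node's metadata in communal storage still has to be cleaned up manually. A minimal sketch, assuming the AWS CLI is installed and credentialed on the host; `node_metadata_path` is a hypothetical helper, and the bucket/prefix mirror the placeholder from the log above:

```shell
#!/bin/bash
# Sketch: build the communal-storage metadata prefix for a dropped node,
# matching the path format shown in the removal message.
node_metadata_path() {
    local communal_root="$1"   # e.g. s3://xxxx/testdcupload (placeholder)
    local db="$2"              # database name
    local node="$3"            # dropped node name
    echo "${communal_root}/metadata/${db}/nodes/${node}"
}

# Example (not executed): delete the leftover metadata for node0003.
# aws s3 rm --recursive \
#   "$(node_metadata_path s3://xxxx/testdcupload eondbtestdc1 v_eondbtestdc1_node0003)"
```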