EON Mode - Adding Nodes Error (Directory already exists)

GoCougs Vertica Customer ✭✭

I'm testing Eon Mode on version 11.0.0-3.

I'm adding and removing the same 2 AWS nodes to and from the primary subcluster. Basically, I'm going from a single node to 3 nodes and back down to 1 node (K-safety 0). When nodes 002 and 003 are not in the db, I stop them in AWS to save $$. I sometimes get this error when I try to add nodes 002 and 003 back in:

Failed to prepare a host for database participation.
See /opt/vertica/log/adminTools.log for details. Host hints follow.
10.20.xx.xx directory already exists: [Details - <ATResult> status=Error host=10.20.xx.xx error_type=<class 'vertica.engine.api.errors.ATDBPrepareError'> error_message=Catalog parent directory already exists: (/vertica/data/demo1_04)] 

I want to force-recreate the directory if possible, so I don't have to go to each of the two new nodes and delete the directory. This is for scripting purposes, for the auto scaling I'm working on.

I'm using this admintools command:

adminTools -t db_add_node -s 10.20.xx.xx,10.20.xx.xx -d demo1_04 -p 'pass'

When I look at the options, I don't see a way to force-recreate the directory. Here are the options I see:

[dbadmin@ad2-demo1-04-node0001 root]$ admintools -t db_add_node -help
Usage: db_add_node [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of the database
  -s HOSTS, --hosts=HOSTS
                        Comma separated list of hosts to add to database
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  -a AHOSTS, --add=AHOSTS
                        Comma separated list of hosts to add to database
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster for the new node
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.
  --compat21            (deprecated) Use Vertica 2.1 method using node names
                        instead of hostnames

Am I missing something? Do I need to fully remove the two nodes from the cluster so Vertica drops the directory before I try to add them to the db again? I want to be able to add and remove the same nodes from the db as smoothly as possible. This is the only error I'm running into right now.

Best Answers

  • GoCougs Vertica Customer ✭✭
    Answer ✓

    @SruthiA thanks for the response. I believe the issue I had was that the drives were not mounted when the nodes started up in AWS. I will need to check that before adding the nodes.

  • GoCougs Vertica Customer ✭✭
    edited December 2021 Answer ✓

    ^^
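
For anyone scripting the mount check mentioned in the answer above, here is a minimal sketch, not a tested production script. It assumes the Vertica volume is mounted at /vertica/data (a guess based on the catalog path in the error message; the real mount point may differ) and that the check runs on the node itself. The function name is_mounted is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if TARGET is listed as a mount point.
# Usage: is_mounted TARGET [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts, where the second field of each
# line is the mount point.
is_mounted() {
    local target="$1"
    local mounts_file="${2:-/proc/mounts}"
    # Exit 0 if any line has TARGET as its mount point, otherwise exit 1.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

# Example (assumed mount point, adjust to your layout):
#   is_mounted /vertica/data || echo "volume not mounted, skipping db_add_node" >&2
```

In an auto scaling script this check would run on each restarted node (for example over ssh) before calling db_add_node, so the add is skipped or retried while the volume is still missing. The util-linux command mountpoint -q does a similar job where it is installed.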

Answers

  • GoCougs Vertica Customer ✭✭

    I think I may have found my problem. The drives on node002 and node003 were not mounted when I started them up on AWS, so that may have messed things up.

  • SruthiA Administrator

    Vertica does delete the catalog and data directories when you remove a node from the database. I just tested it in Eon Mode and it is working as expected. Did you get output similar to the snippet below when you removed the node? We don't remove the depot directory.

    Waiting for rebalance shards for subcluster {'default_subcluster'}. We will wait for at most 36000 seconds.
    Spread Remove Nodes
    Attempting to drop node v_eondbtestdc1_node0003 ( 10.50.xxx.xxx )
    ** Deleting catalog and data directories**
    Eon mode detected. The node v_eondbtestdc1_node0003 has been removed from host 10.50.xxx.xxx. To remove the node metadata completely, the database directories will need to be removed manually, please clean up the files corresponding to this node, at the communal location: s3://xxxx/testdcupload/metadata/eondbtestdc1/nodes/v_eondbtestdc1_node0003
    Reload spread configuration
    Replicating configuration to all nodes
    Checking database state
    Node Status:
    Syncing catalog on eondbtestdc1 with 2000 attempts.
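
Since the removal output above says the catalog and data directories are deleted, a script can confirm that before re-adding a node. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the directory is the one from the error message in the question, and passwordless ssh between the hosts is assumed. It reports hosts where the directory still exists and does not delete anything.

```shell
#!/usr/bin/env bash
# Hypothetical hook: run a command on a host. Assumes passwordless ssh;
# replace with whatever remote-execution method your environment uses.
run_on_host() {
    ssh -n -o BatchMode=yes "$1" "$2"
}

# Print every host (one per line) on which DIR still exists.
# Usage: hosts_with_stale_dir DIR HOST...
# Note: DIR is wrapped in single quotes, so it must not contain one.
hosts_with_stale_dir() {
    local dir="$1"
    shift
    local host
    for host in "$@"; do
        if run_on_host "$host" "test -d '$dir'"; then
            echo "$host"
        fi
    done
}

# Example (directory taken from the error message in the question):
#   stale=$(hosts_with_stale_dir /vertica/data/demo1_04 10.20.xx.xx 10.20.xx.xx)
#   [ -z "$stale" ] || echo "catalog parent directory still present on: $stale" >&2
```

Reporting instead of deleting is deliberate: if the directory is there because a volume was not mounted, removing it automatically could hide the real problem.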
