Setting up a 3-node cluster with cloud private IPs, facing an error
Hello team,
(Note: Vertica is deployed on an Alibaba Cloud server with IP xxx.xx.xxx.46. OS: CentOS 6.9, Vertica Community Edition 10.0.)
I set up a 3-node cluster using the cloud private IPs, with S3 as communal storage, and hit the error below.
[dbadmin@asd ~]$ admintools -t create_db -x auth_params.conf -s '172.16.133.44,172.16.133.45,172.16.133.46' -d VMart1 -p 123 --depot-path=/home/dbadmin/depot --shard-count=6 --communal-storage-location=s3://xxxx-xxxx-x -D /home/dbadmin/data/ -c /home/dbadmin/catalog/ --depot-size 5G
Distributing changes to cluster.
Creating database VMart1
Starting bootstrap node v_vmart1_node0002 (172.16.133.45)
Starting nodes:
v_vmart1_node0002 (172.16.133.45)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_vmart1_node0002: (DOWN)
Node Status: v_vmart1_node0002: (DOWN)
Node Status: v_vmart1_node0002: (DOWN)
Node Status: v_vmart1_node0002: (DOWN)
Node Status: v_vmart1_node0002: (DOWN)
Node Status: v_vmart1_node0002: (UP)
Creating database nodes
Creating node v_vmart1_node0001 (host 172.16.133.44)
Creating node v_vmart1_node0003 (host 172.16.133.46)
Generating new configuration information
Stopping single node db before adding additional nodes.
Database shutdown complete
Starting all nodes
Start hosts = ['172.16.133.44', '172.16.133.45', '172.16.133.46']
Starting nodes:
v_vmart1_node0002 (172.16.133.45)
v_vmart1_node0001 (172.16.133.44)
v_vmart1_node0003 (172.16.133.46)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
It is suggested that you continue waiting.
Do you want to continue waiting? (yes/no) [yes]
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
It is suggested that you continue waiting.
Do you want to continue waiting? (yes/no) [yes]
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN)
ERROR: Not all nodes came up, but not all down. Run scrutinize.
Unable to establish vsql script connection: Unable to connect to 'VMart1'
Unable to establish client-server connection: Unable to connect to 'VMart1'
Unable to create depot storage locations (if Eon) without a client-server connection.
Unable to rebalance shards (if Eon) without a client-server connection.
Unable to set K-safety value without a client-server connection.
Unable to install default extension packages without a vsql script connection
Unable to sync database catalog (if Eon) without a client-server connection.
Database creation SQL tasks included one or more failures (see above).
Database VMart1 created successfully, some nodes may have had startup problems.
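For reference, the auth_params.conf passed via -x typically holds the S3 credentials and endpoint for the communal storage. A minimal sketch is below; every value is a placeholder, not taken from this cluster, and the parameter names (awsauth, awsendpoint, awsenabledhttps) are the standard Vertica bootstrap parameters for S3-compatible storage:

```ini
; Hypothetical auth_params.conf sketch for S3-compatible communal storage.
; All values are placeholders; substitute your own credentials and endpoint.
awsauth = <access_key_id>:<secret_access_key>
awsendpoint = <s3-compatible-endpoint>:<port>
awsenabledhttps = 1
```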
Then I wanted to switch to public IP addresses. On server 172.16.133.44 I removed the private IP addresses of the other two nodes and added their public IP addresses, and everything worked normally. However, when I tried to remove "172.16.133.44" from the node that is connected via its public IP, the previous kind of error happened again, as below:
[root@asd ~]# /opt/vertica/sbin/update_vertica --remove-hosts '172.16.133.44'
Vertica Analytic Database 10.0.0-0 Installation Tool
Validating options...
Mapping hostnames in --remove-hosts (-R) to addresses...
Error: cannot find which cluster host is the local host.
Hint: Is this node in the cluster? Did its IP address change?
Installation FAILED with errors.
Installation stopped before any changes were made.
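The "cannot find which cluster host is the local host" error generally means that none of this machine's own addresses appears in the [Cluster] hosts line of admintools.conf, which is what happens once the conf mixes private and public IPs. A quick check is sketched below; the conf contents are inlined here with placeholder TEST-NET addresses, whereas in practice you would read /opt/vertica/config/admintools.conf:

```shell
# Sketch: does any local address match the [Cluster] hosts list?
# conf_sample uses placeholder TEST-NET addresses; read the real
# /opt/vertica/config/admintools.conf on an actual node.
conf_sample='[Cluster]
hosts = 203.0.113.10,203.0.113.11'

# Extract the comma-separated hosts value as a space-separated list
cluster_hosts=$(printf '%s\n' "$conf_sample" | awk -F' = ' '/^hosts/ {gsub(",", " ", $2); print $2}')

# Addresses owned by this machine (hostname -I may be absent on some systems)
local_addrs=$( { hostname -I 2>/dev/null; ip -o -4 addr show 2>/dev/null | awk '{print $4}' | cut -d/ -f1; } | tr '\n' ' ')

match=no
for h in $cluster_hosts; do
    case " $local_addrs " in
        *" $h "*) match=yes ;;
    esac
done
echo "local host found in [Cluster] hosts: $match"
```

If this prints "no" on the node where you run update_vertica, the tool cannot identify the local host and fails exactly as shown above.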
Below is the admintools.conf:
[root@asd ~]# cat /opt/vertica/config/admintools.conf
[SSHConfig]
ssh_user =
ssh_ident =
ssh_options = -oConnectTimeout=30 -o TCPKeepAlive=no -o ServerAliveInterval=15 -o ServerAliveCountMax=2 -o StrictHostKeyChecking=no -o BatchMode=yes
[BootstrapParameters]
awsendpoint = null
awsregion = null
[Configuration]
format = 3
install_opts = --add-hosts '47.114.xxx.xx' --rpm 'vertica-10.0.0-0.x86_64.RHEL6.rpm'
default_base = /home/dbadmin
controlmode = broadcast
controlsubnet = default
spreadlog = False
last_port = 5433
tmp_dir = /tmp
ipv6 = False
atdebug = False
atgui_default_license = False
unreachable_host_caching = True
aws_metadata_conn_timeout = 2
rebalance_shards_timeout = 36000
database_state_change_poll_timeout = 21600
wait_for_shutdown_timeout = 3600
pexpect_verbose_logging = False
sync_catalog_retries = 2000
admintools_config_version = 109
[Cluster]
hosts = 172.16.133.46,47.114.xx.xx
[Nodes]
node0001 = 172.16.133.46,/home/dbadmin,/home/dbadmin
node0002 = 47.114.xx.xx,/home/dbadmin,/home/dbadmin
Thanks in advance.
Answers
Hello team, please kindly check the issue and assist. Many thanks
If you are still facing this issue, please reinstall the Vertica runtime and run the install_vertica script with the private IPs. Then, if you see that not all nodes can come UP, please share the log files with me.
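For the reinstall step, the command shape would look roughly like the sketch below. The private IPs are the ones from this thread, but the RPM path is a placeholder, and you should confirm the exact options against your installed version's documentation:

```shell
# Hypothetical sketch: re-run the installer against the private IPs only.
# The RPM path is a placeholder for wherever your package actually lives.
sudo /opt/vertica/sbin/install_vertica \
    --hosts 172.16.133.44,172.16.133.45,172.16.133.46 \
    --rpm /root/vertica-10.0.0-0.x86_64.RHEL6.rpm \
    --dba-user dbadmin
```

This rewrites the [Cluster] and [Nodes] entries in admintools.conf consistently, so later --remove-hosts runs can identify the local host again.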