Set up a 3-node cluster with cloud private IPs, facing an error

Ting_cai (Vertica Customer)

Hello team,
I deployed Vertica on an Alibaba Cloud server with IP xxx.xx.xxx.46.
(Note: OS is CentOS 6.9, Vertica Community Edition 10.0.)
I set up a 3-node cluster on the cloud private IPs and hit the error below when creating a database that uses S3 as communal storage.


[dbadmin@asd ~]$ admintools -t create_db -x auth_params.conf -s '172.16.133.44,172.16.133.45,172.16.133.46' -d VMart1 -p 123 --depot-path=/home/dbadmin/depot --shard-count=6 --communal-storage-location=s3://xxxx-xxxx-x -D /home/dbadmin/data/ -c /home/dbadmin/catalog/ --depot-size 5G
Distributing changes to cluster.
 Creating database VMart1
 Starting bootstrap node v_vmart1_node0002 (172.16.133.45)
 Starting nodes: 
  v_vmart1_node0002 (172.16.133.45)
 Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
 Node Status: v_vmart1_node0002: (DOWN) 
 Node Status: v_vmart1_node0002: (DOWN) 
 Node Status: v_vmart1_node0002: (DOWN) 
 Node Status: v_vmart1_node0002: (DOWN) 
 Node Status: v_vmart1_node0002: (DOWN) 
 Node Status: v_vmart1_node0002: (UP) 
 Creating database nodes
 Creating node v_vmart1_node0001 (host 172.16.133.44)
 Creating node v_vmart1_node0003 (host 172.16.133.46)
 Generating new configuration information
 Stopping single node db before adding additional nodes.
  Database shutdown complete
 Starting all nodes
Start hosts = ['172.16.133.44', '172.16.133.45', '172.16.133.46']
 Starting nodes: 
  v_vmart1_node0002 (172.16.133.45)
  v_vmart1_node0001 (172.16.133.44)
  v_vmart1_node0003 (172.16.133.46)
 Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
It is suggested that you continue waiting.
Do you want to continue waiting? (yes/no) [yes] 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
It is suggested that you continue waiting.
Do you want to continue waiting? (yes/no) [yes] 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
 Node Status: v_vmart1_node0001: (DOWN) v_vmart1_node0002: (DOWN) v_vmart1_node0003: (DOWN) 
ERROR: Not all nodes came up, but not all down.  Run scrutinize.
Unable to establish vsql script connection: Unable to connect to 'VMart1'
Unable to establish client-server connection: Unable to connect to 'VMart1'
Unable to create depot storage locations (if Eon) without a client-server connection.
Unable to rebalance shards (if Eon) without a client-server connection.
Unable to set K-safety value without a client-server connection.
Unable to install default extension packages without a vsql script connection
Unable to sync database catalog (if Eon) without a client-server connection.
Database creation SQL tasks included one or more failures (see above).
Database VMart1 created successfully, some nodes may have had startup problems.
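For reference, the auth_params.conf passed with -x is normally where the credentials and endpoint for the communal storage location go. A minimal sketch follows; awsauth, awsendpoint, and awsenablehttps are standard Vertica bootstrap parameters, but every value here is a placeholder, and the assumption that an Alibaba Cloud OSS S3-compatible endpoint is in use is mine, not something stated in this post:

awsauth = <access_key_id>:<secret_access_key>
awsendpoint = <s3-compatible-endpoint-host[:port]>
awsenablehttps = 1

The exact endpoint value depends on the OSS region and whether the internal or public endpoint is used.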


Then I switched to public IP addresses: on server 172.16.133.44 I removed the private IP addresses of the other two nodes, added their public IP addresses instead, and everything worked normally.
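Incidentally, re-addressing an existing Vertica cluster is usually done with the admintools re_ip tool and a mapping file (with the database stopped) rather than by editing addresses by hand. A minimal sketch, with placeholder public addresses and a hypothetical file name re_ip_map.txt containing one old-address/new-address pair per line:

172.16.133.44 <public_ip_1>
172.16.133.45 <public_ip_2>
172.16.133.46 <public_ip_3>

admintools -t re_ip -f re_ip_map.txt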

However, when I tried to remove "172.16.133.44" from the node that is connected via its public IP, the operation failed again, as shown below:


[root@asd ~]# /opt/vertica/sbin/update_vertica --remove-hosts '172.16.133.44'
Vertica Analytic Database 10.0.0-0 Installation Tool

Validating options...

Mapping hostnames in --remove-hosts (-R) to addresses...
Error: cannot find which cluster host is the local host.
Hint: Is this node in the cluster? Did its IP address change?
Installation FAILED with errors.
Installation stopped before any changes were made.
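One generic way to see why the installer cannot match the local host is to compare the addresses actually configured on the machine with the hosts that the Vertica tooling has recorded; if the local machine only carries its private address while admintools.conf lists its public one (or does not list it at all), the lookup fails. A quick check, assuming the default config path:

# Addresses actually bound on this machine
ip -4 addr show | grep inet

# Hosts and nodes that the Vertica tooling believes are in the cluster
grep -E '^(hosts|node[0-9]+)' /opt/vertica/config/admintools.conf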


Below is the admintools.conf:


[root@asd ~]# cat /opt/vertica/config/admintools.conf
[SSHConfig]
ssh_user =
ssh_ident =
ssh_options = -oConnectTimeout=30 -o TCPKeepAlive=no -o ServerAliveInterval=15 -o ServerAliveCountMax=2 -o StrictHostKeyChecking=no -o BatchMode=yes
[BootstrapParameters]
awsendpoint = null
awsregion = null
[Configuration]
format = 3
install_opts = --add-hosts '47.114.xxx.xx' --rpm 'vertica-10.0.0-0.x86_64.RHEL6.rpm'
default_base = /home/dbadmin
controlmode = broadcast
controlsubnet = default
spreadlog = False
last_port = 5433
tmp_dir = /tmp
ipv6 = False
atdebug = False
atgui_default_license = False
unreachable_host_caching = True
aws_metadata_conn_timeout = 2
rebalance_shards_timeout = 36000
database_state_change_poll_timeout = 21600
wait_for_shutdown_timeout = 3600
pexpect_verbose_logging = False
sync_catalog_retries = 2000
admintools_config_version = 109
[Cluster]
hosts = 172.16.133.46,47.114.xx.xx
[Nodes]
node0001 = 172.16.133.46,/home/dbadmin,/home/dbadmin
node0002 = 47.114.xx.xx,/home/dbadmin,/home/dbadmin
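For comparison, a 3-node cluster in which every host is registered by the same (private) address would have [Cluster] and [Nodes] sections shaped roughly like the sketch below. This is only an illustration of the expected shape; admintools.conf is maintained by the install and admintools commands and is not meant to be edited by hand:

[Cluster]
hosts = 172.16.133.44,172.16.133.45,172.16.133.46
[Nodes]
node0001 = 172.16.133.44,/home/dbadmin,/home/dbadmin
node0002 = 172.16.133.45,/home/dbadmin,/home/dbadmin
node0003 = 172.16.133.46,/home/dbadmin,/home/dbadmin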


Thanks in advance.

Answers

  • Ting_cai (Vertica Customer)

    Hello team, please kindly check the issue and assist. Many thanks

  • Hibiki (Vertica Employee)

    If you are still facing this issue, please reinstall the Vertica runtime and run the install_vertica script with the private IPs. Then, if not all nodes come up, please share the log files with me.
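
    For example, something like the following, assuming the RPM name from the earlier install and the default dbadmin user (--point-to-point is generally recommended on cloud networks, and is an addition here rather than something from this thread):

    /opt/vertica/sbin/install_vertica \
      --hosts '172.16.133.44,172.16.133.45,172.16.133.46' \
      --rpm vertica-10.0.0-0.x86_64.RHEL6.rpm \
      --dba-user dbadmin \
      --point-to-point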
