Vertica not initializing when creating new database
I keep seeing the output below and the DB is not coming up. I looked at the logs and don't see what might be wrong:
---------------
Do you want to continue waiting? (yes/no) [yes] yes
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Nodes DOWN: v_test_node0003, v_test_node0002, v_test_node0001 (may be still initializing).
--------------
2015-06-16 16:54:28.000 Timer Service:0x81747d0 [Util] <INFO> Task 'LicenseSizeAuditor' enabled
2015-06-16 16:54:28.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:28.908 Init Session:0x7fd398010fd0 <LOG> @[initializing]: 00000/2705: Connection received: host=10.241.251.203 port=36758 (connCnt 1)
2015-06-16 16:54:28.908 Init Session:0x7fd398010fd0 <LOG> @[initializing]: 00000/4540: Received SSL negotiation startup packet
2015-06-16 16:54:28.908 Init Session:0x7fd398010fd0 <LOG> @[initializing]: 00000/4691: Sending SSL negotiation response 'N'
2015-06-16 16:54:28.909 Init Session:0x7fd398010fd0 <FATAL> @[initializing]: {SessionRun} 57V03/5785: Cluster Status Request by 10.241.251.203:36758
HINT: Cluster State: test
--Waiting for cluster invitation
----
LOCATION: initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
2015-06-16 16:54:29.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:30.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:31.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:32.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
Comments
Hi,
Can you check whether the startup.log file contains something like the following?
{
"node" : <node name>
"stage" : "Waiting for Cluster Invite",
"text" : "Ready to be invited",
}
-Regards,
Sruthi
{
"node" : "v_test_node0002",
"stage" : "Waiting for Cluster Invite",
"text" : "Prepare to be invited",
"timestamp" : "2015-06-16 16:50:37.000"
}
{
"node" : "v_test_node0002",
"stage" : "Waiting for Cluster Invite",
"text" : "Ready to be invited",
"timestamp" : "2015-06-16 16:50:37.144"
}
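For anyone who wants to check this without reading the file by eye, here is a minimal sketch. It assumes startup.log is a run of JSON objects written back to back, as the excerpt above suggests; the function names are made up for illustration and are not part of any Vertica tool.

```python
import json


def parse_startup_log(text):
    """Decode back-to-back JSON objects into a list of dicts."""
    decoder = json.JSONDecoder()
    entries = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        entries.append(obj)
    return entries


def latest_stage_per_node(entries):
    """Map node name -> (stage, text) of the last entry seen for that node."""
    latest = {}
    for entry in entries:
        latest[entry["node"]] = (entry["stage"], entry["text"])
    return latest


# The two entries posted in this thread, copied verbatim.
sample = '''
{
"node" : "v_test_node0002",
"stage" : "Waiting for Cluster Invite",
"text" : "Prepare to be invited",
"timestamp" : "2015-06-16 16:50:37.000"
}
{
"node" : "v_test_node0002",
"stage" : "Waiting for Cluster Invite",
"text" : "Ready to be invited",
"timestamp" : "2015-06-16 16:50:37.144"
}
'''

if __name__ == "__main__":
    for node, (stage, text) in latest_stage_per_node(parse_startup_log(sample)).items():
        print(node, "|", stage, "|", text)
```

To use it on a real file, read the file contents into a string and pass that to parse_startup_log instead of the sample.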
Hi,
Can you please run the following command and share the output?
/opt/vertica/bin/vnetperf --condense --hosts host1,host2...
-Regards,
Sruthi
Please find it in the attachment.
Hi,
Could you please start creating the database again and answer "no" when it prompts "Do you want to continue waiting?". That gives us the exact error message and helps narrow down the issue.
-Regards,
Sruthi
Do you want the error message from the console where I say "no"?
Please find it below:
*** Creating database: test ***
10.241.251.203 OK [vertica][(7, 1, 1)][000][x86_64]
10.241.251.132 OK [vertica][(7, 1, 1)][000][x86_64]
10.241.251.134 OK [vertica][(7, 1, 1)][000][x86_64]
Checking full connectivity
Creating database test
Starting bootstrap node v_test_node0001 (10.241.251.203)
Starting nodes:
v_test_node0001 (10.241.251.203)
Starting Vertica on all nodes. Please wait, databases with large catalogs may take a while to initialize.
Node Status: v_test_node0001: (INITIALIZING)
Node Status: v_test_node0001: (INITIALIZING)
Node Status: v_test_node0001: (INITIALIZING)
Node Status: v_test_node0001: (UP)
Creating database nodes
Creating node v_test_node0002 (host 10.241.251.132)
Creating node v_test_node0003 (host 10.241.251.134)
Generating new configuration information
Stopping bootstrap node
Starting all nodes
Starting nodes:
v_test_node0001 (10.241.251.203)
v_test_node0002 (10.241.251.132)
v_test_node0003 (10.241.251.134)
Starting Vertica on all nodes. Please wait, databases with large catalogs may take a while to initialize.
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Nodes DOWN: v_test_node0003, v_test_node0002, v_test_node0001 (may be still initializing).
It is suggested that you continue waiting.
Do you want to continue waiting? (yes/no) [yes] no
ERROR: Not all nodes came up, but not all down. Run scrutinize.
Press RETURN to continue
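If you capture this console output to a file, the "Node Status:" lines can be summarized with a few lines of Python. This is only a sketch based on the line format shown in this thread; the function names are hypothetical and other versions may print a different format.

```python
import re

# Matches pairs such as "v_test_node0001: (INITIALIZING)".
NODE_STATE = re.compile(r"(\S+): \((\w+)\)")


def parse_node_status(line):
    """Return {node_name: state} for one 'Node Status:' line, else {}."""
    prefix = "Node Status:"
    if not line.startswith(prefix):
        return {}
    return dict(NODE_STATE.findall(line[len(prefix):]))


def nodes_not_up(line):
    """Return the sorted names of nodes whose state is anything but UP."""
    return sorted(n for n, s in parse_node_status(line).items() if s != "UP")


if __name__ == "__main__":
    line = ("Node Status: v_test_node0001: (INITIALIZING) "
            "v_test_node0002: (DOWN) v_test_node0003: (DOWN)")
    print(parse_node_status(line))
    print(nodes_not_up(line))
```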
Hi,
Can you check whether all nodes are UP? Run select * from nodes.
-Regards,
Sruthi
That is the problem: the nodes are not coming up.
[root@ip-10-241-251-203 ~]# !99
/opt/vertica/bin/vsql --no-vsqlrc -n -p 5433 -h '10.241.251.134' test
vsql: FATAL 4149: Node startup/recovery in progress. Not yet ready to accept connections
Hi,
Can you disable the firewall as described in the documentation and try creating the database again? I think this should work.
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/InstallationGuide/BeforeYouInstall/iptablesEnabled.htm%3FTocPath%3DInstallation%20Guide
-Regards,
Sruthi
iptables is not running on any node. What makes you believe it's an iptables issue?
Hi,
Since the nodes are not ready to accept connections, I thought a firewall might be blocking their communication. Can you check whether spread is running on all nodes of your cluster? Run this command on all three nodes and share the output:
ps -ef | grep spread
Can you attach vertica.log as well?
-Regards,
Sruthi
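One quick way to test the "something is blocking communication" theory is to try a TCP connection from each node to the others. The sketch below is only a rough check, not a Vertica tool: the helper names are made up, and it tests TCP only, so a success says nothing about UDP traffic between the nodes.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_hosts(hosts, ports, timeout=2.0):
    """Return {(host, port): bool} for every host/port combination."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

For example, check_hosts(["10.241.251.132", "10.241.251.134"], [5433]) run from the first node would try the client port that vsql uses in this thread. Which other ports matter for your setup is something to confirm against the installation guide.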
[root@ip-10-241-251-203 ~]# for i in host1 host2 host3; do echo "----" ; ssh $i "ps -eaf|grep spread"; done
----
root 1998 17903 0 20:01 pts/0 00:00:00 ssh host1 ps -eaf|grep spread
root 2001 1999 0 20:01 ? 00:00:00 bash -c ps -eaf|grep spread
root 2009 2001 0 20:01 ? 00:00:00 grep spread
vertica 19067 1 0 17:51 pts/1 00:00:01 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0001_catalog/spread.conf
----
vertica 16074 1 0 17:51 ? 00:00:01 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0002_catalog/spread.conf
root 31023 31021 0 20:01 ? 00:00:00 bash -c ps -eaf|grep spread
root 31031 31023 0 20:01 ? 00:00:00 grep spread
----
vertica 15953 1 0 17:51 ? 00:00:01 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0003_catalog/spread.conf
root 30890 30888 0 20:01 ? 00:00:00 bash -c ps -eaf|grep spread
root 30898 30890 0 20:01 ? 00:00:00 grep spread
[root@ip-10-241-251-203 ~]#
Hi,
I checked the vertica.log file, and it looks like there are some issues with spread:
2015-06-17 17:51:48.370 Init Session:0x7f89f8010fc0-a000000000000d [Comms] <INFO> Sending reload command to spread
2015-06-17 17:51:48.370 Poll dispatch:0x8005ef0 [Comms] <INFO> Sent reload command to spread daemon
2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d <WARNING> @v_test_node0001: 01000/4539: Received no response from v_test_node0002 in reload spread config
2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d <WARNING> @v_test_node0001: 01000/4539: Received no response from v_test_node0003 in reload spread config
2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d [Txn] <INFO> Rollback Txn: a000000000000d 'reloadSpreadConfig'
2015-06-17 17:51:50.574 Init Session:0x7f89f8010fc0 <LOG> @v_test_node0001: 00000/4719: Session ip-10-241-251-203.u-18863:0x13 ended; closing connection (connCnt 1)
Could you please check if the owner of vspread.conf is dbadmin?
ls -l /opt/vertica/config/vspread.conf
-Regards,
Sruthi
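To see quickly which nodes show up in these warnings across a large vertica.log, a small scan like the following may help. It is a sketch that relies only on the warning text quoted above; the function name is hypothetical.

```python
import re

# Warning text as it appears in the vertica.log excerpt in this thread.
NO_RESPONSE = re.compile(r"Received no response from (\S+) in reload spread config")


def unresponsive_nodes(log_lines):
    """Return node names, in first-seen order, named in the no-response warnings."""
    seen = []
    for line in log_lines:
        match = NO_RESPONSE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    excerpt = [
        "2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d <WARNING> "
        "@v_test_node0001: 01000/4539: Received no response from v_test_node0002 "
        "in reload spread config",
        "2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d <WARNING> "
        "@v_test_node0001: 01000/4539: Received no response from v_test_node0003 "
        "in reload spread config",
    ]
    print(unresponsive_nodes(excerpt))
```

To scan a real log, open the file and pass the file object itself as log_lines.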
Hi
If this DB is currently down, I would suggest starting the nodes individually on all three nodes, using a command like the one below.
For example:
/opt/vertica/bin/vertica -D /home/dbadmin/test_crane/v_test_crane_node0001_catalog -C test_crane -n v_test_crane_node0001 -h 10.50.52.41 -p 5433 -P 4803 -Y ipv4
You can find a similar command at the beginning of your vertica.log files and run it on each node, changing the IP and node name accordingly.
Let me know if this helps.
Regards
Rahul Choudhary
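If you want to generate the per-node command rather than edit it by hand, a sketch like this may help. The flags and their order are copied from the example command above; the function and parameter names are hypothetical labels of mine, and the values to pass should come from your own vertica.log or process listing.

```python
import shlex


def vertica_start_argv(catalog_dir, db_name, node_name, host,
                       p_port=5433, big_p_port=4803):
    """Build the argument list for starting one node, mirroring the example command."""
    return [
        "/opt/vertica/bin/vertica",
        "-D", catalog_dir,       # catalog directory of this node
        "-C", db_name,           # database name
        "-n", node_name,         # node name
        "-h", host,              # IP address of this node
        "-p", str(p_port),       # value passed to -p in the example
        "-P", str(big_p_port),   # value passed to -P in the example
        "-Y", "ipv4",
    ]


if __name__ == "__main__":
    argv = vertica_start_argv(
        "/home/dbadmin/test_crane/v_test_crane_node0001_catalog",
        "test_crane",
        "v_test_crane_node0001",
        "10.50.52.41",
    )
    # Print the command for review instead of executing it.
    print(shlex.join(argv))
```

The sketch only prints the command so it can be compared with the one in vertica.log before anything is run.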
This is the only vspread.conf I see:
[root@ip-10-241-251-203 ~]# ls -lltr /opt/vertica/agent/test/config/vspread.conf
-rw-r--r--. 1 root root 222 Dec 17 2012 /opt/vertica/agent/test/config/vspread.conf
Hi,
If all nodes are UP, I would suggest restarting spread with the following command on all nodes of your cluster:
sudo /etc/init.d/spreadd restart
After restarting spread, create the database.
If the nodes are down, bring them UP using the procedure Rahul described.
-Regards,
Sruthi
How can I tell what state I am in? These are the processes running:
[root@ip-10-241-251-203 ~]# for i in host1 host2 host3
> do
> ssh $i "ps -eaf|grep vertica"
> done
vertica 3288 1 0 Jun16 ? 00:00:00 /bin/bash /opt/vertica/agent/agent.sh /opt/vertica/config/users/vertica/agent.conf
vertica 3295 3288 0 Jun16 ? 00:08:08 /opt/vertica/oss/python/bin/python ./simply_fast.py
vertica 19067 1 0 Jun17 ? 00:00:31 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0001_catalog/spread.conf
vertica 19069 1 0 Jun17 ? 00:21:12 /opt/vertica/bin/vertica -D /home/vertica/test/v_test_node0001_catalog -C test -n v_test_node0001 -h 10.241.251.203 -p 5433 -P 4803 -Y ipv4
vertica 19072 19069 0 Jun17 ? 00:00:32 /opt/vertica/bin/vertica-udx-zygote 12 10 19069 debug-log-off /home/vertica/test/v_test_node0001_catalog/UDxLogs
root 27472 22884 0 13:41 pts/0 00:00:00 ssh host1 ps -eaf|grep vertica
root 27475 27473 0 13:41 ? 00:00:00 bash -c ps -eaf|grep vertica
root 27483 27475 0 13:41 ? 00:00:00 grep vertica
vertica 2939 1 0 Jun16 ? 00:00:00 /bin/bash /opt/vertica/agent/agent.sh /opt/vertica/config/users/vertica/agent.conf
vertica 2946 2939 0 Jun16 ? 00:00:44 /opt/vertica/oss/python/bin/python ./simply_fast.py
vertica 16074 1 0 Jun17 ? 00:00:34 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0002_catalog/spread.conf
vertica 16076 1 0 Jun17 ? 00:16:34 /opt/vertica/bin/vertica -D /home/vertica/test/v_test_node0002_catalog -C test -n v_test_node0002 -h 10.241.251.132 -p 5433 -P 4803 -Y ipv4
vertica 16079 16076 0 Jun17 ? 00:00:33 /opt/vertica/bin/vertica-udx-zygote 12 10 16076 debug-log-off /home/vertica/test/v_test_node0002_catalog/UDxLogs
root 23945 23943 0 13:41 ? 00:00:00 bash -c ps -eaf|grep vertica
root 23953 23945 0 13:41 ? 00:00:00 grep vertica
vertica 2930 1 0 Jun16 ? 00:00:00 /bin/bash /opt/vertica/agent/agent.sh /opt/vertica/config/users/vertica/agent.conf
vertica 2937 2930 0 Jun16 ? 00:00:45 /opt/vertica/oss/python/bin/python ./simply_fast.py
vertica 15953 1 0 Jun17 ? 00:00:29 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0003_catalog/spread.conf
vertica 15955 1 0 Jun17 ? 00:14:59 /opt/vertica/bin/vertica -D /home/vertica/test/v_test_node0003_catalog -C test -n v_test_node0003 -h 10.241.251.134 -p 5433 -P 4803 -Y ipv4
vertica 15958 15955 0 Jun17 ? 00:00:30 /opt/vertica/bin/vertica-udx-zygote 12 10 15955 debug-log-off /home/vertica/test/v_test_node0003_catalog/UDxLogs
root 23827 23825 0 13:41 ? 00:00:00 bash -c ps -eaf|grep vertica
root 23835 23827 0 13:41 ? 00:00:00 grep vertica
[root@ip-10-241-251-203 ~]#
Hi,
You can find out in either of two ways:
1) Go to vsql and run the query
select * from nodes. It will show you the state of all nodes in the Vertica cluster.
2) Open admintools and select "View Database Cluster State"; it will show you the status of all the nodes in the Vertica cluster.
-Regards,
Sruthi
I am unable to determine the status:
1) When I execute adminTools -> View Status it immediately exits:
adminTools Last Chance Error Handler running...
raised error: <class 'ConfigParser.NoOptionError'>
error message: No option 'v_test_node0001' in section: 'Nodes'
trace file: /opt/vertica/log/adminTools-vertica.errors
REPORT THIS INFORMATION TO TECHNICAL SUPPORT
AND INCLUDE CONTENTS OF THE TRACE FILE IN YOUR REPORT
2) When I type vsql I get:
[vertica@ip-10-241-251-220 ~]$ vsql
vsql: FATAL 4149: Node startup/recovery in progress. Not yet ready to accept connections
[vertica@ip-10-241-251-220 ~]$
Hi,
Can you share the admintools.conf file from all 3 nodes? It is in the directory /opt/vertica/config.
From any node, after you log in, please run the following and share the output:
rpm -qa|grep vertica
-Regards,
Sruthi
[vertica@ip-10-241-251-220 ~]$ rpm -qa|grep vertica
vertica-7.1.1-0.x86_64
As for conf, it is in 3 locations. Which one should I attach?
/opt/vertica/agent/test/support/config/admintools.conf
/opt/vertica/agent/test/config/admintools.conf
/opt/vertica/config/admintools.conf
Hi,
Could you please share the one at /opt/vertica/config/admintools.conf?
-Regards,
Sruthi
[root@ip-10-241-251-220 ~]# cat /opt/vertica/config/admintools.conf
[Configuration]
last_port = 5433
tmp_dir = /tmp
default_base = /home/dbadmin
format = 3
install_opts = -s 'host1,host2,host3' -r './v/vertica-7.1.1-0.x86_64.RHEL5.rpm' -u vertica --failure-threshold NONE
spreadlog = False
controlsubnet = default
controlmode = broadcast
[Cluster]
hosts = 10.241.251.220,10.241.251.144,10.241.251.145
[Nodes]
node0001 = 10.241.251.220,/home/vertica,/home/vertica
node0002 = 10.241.251.144,/home/vertica,/home/vertica
node0003 = 10.241.251.145,/home/vertica,/home/vertica
[Database:test]
restartpolicy = ksafe
port = 5433
path = /home/vertica/test/v_test_node0001_catalog
nodes = v_test_node0001,v_test_node0002,v_test_node0003
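One thing visible in this file: [Database:test] lists v_test_node0001, v_test_node0002 and v_test_node0003, while the keys under [Nodes] are node0001, node0002 and node0003. That matches the adminTools message "No option 'v_test_node0001' in section: 'Nodes'" posted earlier. The sketch below checks a file for that kind of mismatch. It uses Python 3's configparser and a made-up function name, and it does not claim to reproduce how adminTools itself reads the file, nor that this mismatch is the root cause.

```python
import configparser


def missing_node_entries(conf_text):
    """Return {database_section: [node names with no key under [Nodes]]}."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(conf_text)
    missing = {}
    for section in cfg.sections():
        if not section.startswith("Database:"):
            continue
        raw = cfg.get(section, "nodes", fallback="")
        names = [name.strip() for name in raw.split(",") if name.strip()]
        # has_option returns False if the key, or the whole section, is absent.
        absent = [name for name in names if not cfg.has_option("Nodes", name)]
        if absent:
            missing[section] = absent
    return missing


# The [Nodes] and [Database:test] sections as posted in this thread.
sample = """
[Nodes]
node0001 = 10.241.251.220,/home/vertica,/home/vertica
node0002 = 10.241.251.144,/home/vertica,/home/vertica
node0003 = 10.241.251.145,/home/vertica,/home/vertica
[Database:test]
restartpolicy = ksafe
port = 5433
path = /home/vertica/test/v_test_node0001_catalog
nodes = v_test_node0001,v_test_node0002,v_test_node0003
"""

if __name__ == "__main__":
    print(missing_node_entries(sample))
```

An empty result would mean every node named by a database has a matching key under [Nodes].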