Vertica not initializing when creating new database

I keep seeing this and the DB is not coming up. I looked at the logs and don't see what might be wrong:

 

---------------

Do you want to continue waiting? (yes/no) [yes] yes
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
Nodes DOWN: v_test_node0003, v_test_node0002, v_test_node0001 (may be still initializing).

--------------

 

2015-06-16 16:54:28.000 Timer Service:0x81747d0 [Util] <INFO> Task 'LicenseSizeAuditor' enabled
2015-06-16 16:54:28.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:28.908 Init Session:0x7fd398010fd0 <LOG> @[initializing]: 00000/2705: Connection received: host=10.241.251.203 port=36758 (connCnt 1)
2015-06-16 16:54:28.908 Init Session:0x7fd398010fd0 <LOG> @[initializing]: 00000/4540: Received SSL negotiation startup packet
2015-06-16 16:54:28.908 Init Session:0x7fd398010fd0 <LOG> @[initializing]: 00000/4691: Sending SSL negotiation response 'N'
2015-06-16 16:54:28.909 Init Session:0x7fd398010fd0 <FATAL> @[initializing]: {SessionRun} 57V03/5785: Cluster Status Request by 10.241.251.203:36758
HINT: Cluster State: test
--Waiting for cluster invitation
----
LOCATION: initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
2015-06-16 16:54:29.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:30.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:31.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node
2015-06-16 16:54:32.001 nameless:0x816a600 [Catalog] <INFO> getLocalStorageLocations: no local node

 

Comments

  • SruthiA Administrator

    Hi,

     

      Can you check the startup.log file to see if it has something like the below:

     

    {
      "node" : <node name>,
      "stage" : "Waiting for Cluster Invite",
      "text" : "Ready to be invited"
    }

     

    -Regards,

     Sruthi

  • {
    "node" : "v_test_node0002",
    "stage" : "Waiting for Cluster Invite",
    "text" : "Prepare to be invited",
    "timestamp" : "2015-06-16 16:50:37.000"
    }

    {
    "node" : "v_test_node0002",
    "stage" : "Waiting for Cluster Invite",
    "text" : "Ready to be invited",
    "timestamp" : "2015-06-16 16:50:37.144"
    }
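The "Ready to be invited" entries above suggest the non-bootstrap nodes are waiting on a cluster invite rather than failing on their own. As a quick check, here is a sketch that pulls the most recent "text" field from a startup.log-style file. The sample entries are inlined so the snippet is self-contained; on a real node you would point STARTUP_LOG at the file under that node's catalog directory (a path that varies per install):

```shell
# Sketch: report the last startup stage a node recorded in startup.log.
# Sample entries are used here; on a node, set STARTUP_LOG to the real file.
STARTUP_LOG=$(mktemp)
cat > "$STARTUP_LOG" <<'EOF'
{
"node" : "v_test_node0002",
"stage" : "Waiting for Cluster Invite",
"text" : "Prepare to be invited",
"timestamp" : "2015-06-16 16:50:37.000"
}
{
"node" : "v_test_node0002",
"stage" : "Waiting for Cluster Invite",
"text" : "Ready to be invited",
"timestamp" : "2015-06-16 16:50:37.144"
}
EOF
# Grab the most recent "text" field; "Ready to be invited" means the node
# is blocked waiting on the bootstrap node, not wedged on local startup.
LAST_TEXT=$(grep '"text"' "$STARTUP_LOG" | tail -1)
echo "$LAST_TEXT"
rm -f "$STARTUP_LOG"
```

Running the same grep over each node's startup.log shows at a glance which node is stuck at which stage.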

  • SruthiA Administrator

    Hi,

     

      Can you please run the following command and share the output?

    /opt/vertica/bin/vnetperf --condense --hosts host1,host2...

     

    -Regards,

     Sruthi

     

  • Please find it in the attachment.

  • SruthiA Administrator

    Hi,

     

      Could you please start creating the database and answer "no" when it prompts "Do you want to continue waiting?" That gives us the exact error message and helps narrow down the issue.

     

    -Regards,

     Sruthi

  • Do you want the error message from the console where I say "no"?

  • Please find it below:

     


    *** Creating database: test ***
    10.241.251.203 OK [vertica][(7, 1, 1)][000][x86_64]
    10.241.251.132 OK [vertica][(7, 1, 1)][000][x86_64]
    10.241.251.134 OK [vertica][(7, 1, 1)][000][x86_64]
    Checking full connectivity
    Creating database test
    Starting bootstrap node v_test_node0001 (10.241.251.203)
    Starting nodes:
    v_test_node0001 (10.241.251.203)

    Starting Vertica on all nodes. Please wait, databases with large catalogs may take a while to initialize.

    Node Status: v_test_node0001: (INITIALIZING)
    Node Status: v_test_node0001: (INITIALIZING)
    Node Status: v_test_node0001: (INITIALIZING)
    Node Status: v_test_node0001: (UP)
    Creating database nodes
    Creating node v_test_node0002 (host 10.241.251.132)
    Creating node v_test_node0003 (host 10.241.251.134)
    Generating new configuration information
    Stopping bootstrap node
    Starting all nodes
    Starting nodes:
    v_test_node0001 (10.241.251.203)
    v_test_node0002 (10.241.251.132)
    v_test_node0003 (10.241.251.134)

    Starting Vertica on all nodes. Please wait, databases with large catalogs may take a while to initialize.

    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Node Status: v_test_node0001: (INITIALIZING) v_test_node0002: (DOWN) v_test_node0003: (DOWN)
    Nodes DOWN: v_test_node0003, v_test_node0002, v_test_node0001 (may be still initializing).
    It is suggested that you continue waiting.
    Do you want to continue waiting? (yes/no) [yes] no
    ERROR: Not all nodes came up, but not all down. Run scrutinize.
    Press RETURN to continue

  • SruthiA Administrator

    Hi,

     

     Can you check if all nodes are UP? Issue: select * from nodes;

     

    -Regards,

     Sruthi

  • That is exactly the problem: the nodes are not coming up.

     

    [root@ip-10-241-251-203 ~]# !99
    /opt/vertica/bin/vsql --no-vsqlrc -n -p 5433 -h '10.241.251.134' test
    vsql: FATAL 4149: Node startup/recovery in progress. Not yet ready to accept connections

  • SruthiA Administrator

    Hi,

     

      Can you disable the firewall as mentioned in the documentation and try creating the database? I think this should work:

     

    https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/InstallationGuide/BeforeYouInstall/iptablesEnabled.htm%3FTocPath%3DInstallation%20Guide

     

     

    -Regards,

     Sruthi

  • iptables is not running on any node. What makes you believe it's an iptables issue?

  • SruthiA Administrator

    Hi,

     

      Since the nodes are not ready to accept connections, I thought there might be a firewall issue blocking their communication. Can you check if spread is running on all nodes of your cluster?

     

    ps -ef | grep spread    # issue this command on all three nodes of the cluster and share the output

     

    Can you attach vertica.log as well?

     

    -Regards,

     Sruthi

  • [root@ip-10-241-251-203 ~]# for i in host1 host2 host3; do echo "----" ; ssh $i "ps -eaf|grep spread"; done
    ----
    root 1998 17903 0 20:01 pts/0 00:00:00 ssh host1 ps -eaf|grep spread
    root 2001 1999 0 20:01 ? 00:00:00 bash -c ps -eaf|grep spread
    root 2009 2001 0 20:01 ? 00:00:00 grep spread
    vertica 19067 1 0 17:51 pts/1 00:00:01 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0001_catalog/spread.conf
    ----
    vertica 16074 1 0 17:51 ? 00:00:01 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0002_catalog/spread.conf
    root 31023 31021 0 20:01 ? 00:00:00 bash -c ps -eaf|grep spread
    root 31031 31023 0 20:01 ? 00:00:00 grep spread
    ----
    vertica 15953 1 0 17:51 ? 00:00:01 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0003_catalog/spread.conf
    root 30890 30888 0 20:01 ? 00:00:00 bash -c ps -eaf|grep spread
    root 30898 30890 0 20:01 ? 00:00:00 grep spread
    [root@ip-10-241-251-203 ~]#

  • SruthiA Administrator

    Hi,

     

     I checked the vertica.log file and it looks like there are some issues with spread.

     

    2015-06-17 17:51:48.370 Init Session:0x7f89f8010fc0-a000000000000d [Comms] <INFO> Sending reload command to spread
    2015-06-17 17:51:48.370 Poll dispatch:0x8005ef0 [Comms] <INFO> Sent reload command to spread daemon
    2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d <WARNING> @v_test_node0001: 01000/4539: Received no response from v_test_node0002 in reload spread config
    2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d <WARNING> @v_test_node0001: 01000/4539: Received no response from v_test_node0003 in reload spread config
    2015-06-17 17:51:50.371 Init Session:0x7f89f8010fc0-a000000000000d [Txn] <INFO> Rollback Txn: a000000000000d 'reloadSpreadConfig'
    2015-06-17 17:51:50.574 Init Session:0x7f89f8010fc0 <LOG> @v_test_node0001: 00000/4719: Session ip-10-241-251-203.u-18863:0x13 ended; closing connection (connCnt 1)

     

     

     

    Could you please check if the owner of vspread.conf is dbadmin?

     

    ls -l /opt/vertica/config/vspread.conf

     

    -Regards,

     Sruthi

     

     

  • Hi 

     

    If this DB is currently down, I would suggest you try starting the nodes individually on all three nodes, using the command below as an example:

     

    For example:

     

    /opt/vertica/bin/vertica -D /home/dbadmin/test_crane/v_test_crane_node0001_catalog -C test_crane -n v_test_crane_node0001 -h 10.50.52.41 -p 5433 -P 4803 -Y ipv4

     

     

    You can find a similar command near the beginning of your vertica.log files and can run it individually, changing the corresponding IP and node name.

     

     

    Let me know if this helps.

     

    Regards

    Rahul Choudhary

  • This is the only vspread.conf I see:

     

    root@ip-10-241-251-203 ~]# ls -lltr /opt/vertica/agent/test/config/vspread.conf

    -rw-r--r--. 1 root root 222 Dec 17 2012 /opt/vertica/agent/test/config/vspread.conf

  • SruthiA Administrator

    Hi,

     

      If all nodes are UP, I would suggest restarting spread using the following command on all nodes of your cluster:

     

    sudo /etc/init.d/spreadd restart

     

    After restarting spread, create database.

     

     

    If the nodes are down, bring them UP using the procedure Rahul mentioned.

     

    -Regards,

     Sruthi

  • How can I tell what state I am in? These are the processes running:

     

    [root@ip-10-241-251-203 ~]# for i in host1 host2 host3
    > do
    > ssh $i "ps -eaf|grep vertica"
    > done
    vertica 3288 1 0 Jun16 ? 00:00:00 /bin/bash /opt/vertica/agent/agent.sh /opt/vertica/config/users/vertica/agent.conf
    vertica 3295 3288 0 Jun16 ? 00:08:08 /opt/vertica/oss/python/bin/python ./simply_fast.py
    vertica 19067 1 0 Jun17 ? 00:00:31 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0001_catalog/spread.conf
    vertica 19069 1 0 Jun17 ? 00:21:12 /opt/vertica/bin/vertica -D /home/vertica/test/v_test_node0001_catalog -C test -n v_test_node0001 -h 10.241.251.203 -p 5433 -P 4803 -Y ipv4
    vertica 19072 19069 0 Jun17 ? 00:00:32 /opt/vertica/bin/vertica-udx-zygote 12 10 19069 debug-log-off /home/vertica/test/v_test_node0001_catalog/UDxLogs
    root 27472 22884 0 13:41 pts/0 00:00:00 ssh host1 ps -eaf|grep vertica
    root 27475 27473 0 13:41 ? 00:00:00 bash -c ps -eaf|grep vertica
    root 27483 27475 0 13:41 ? 00:00:00 grep vertica
    vertica 2939 1 0 Jun16 ? 00:00:00 /bin/bash /opt/vertica/agent/agent.sh /opt/vertica/config/users/vertica/agent.conf
    vertica 2946 2939 0 Jun16 ? 00:00:44 /opt/vertica/oss/python/bin/python ./simply_fast.py
    vertica 16074 1 0 Jun17 ? 00:00:34 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0002_catalog/spread.conf
    vertica 16076 1 0 Jun17 ? 00:16:34 /opt/vertica/bin/vertica -D /home/vertica/test/v_test_node0002_catalog -C test -n v_test_node0002 -h 10.241.251.132 -p 5433 -P 4803 -Y ipv4
    vertica 16079 16076 0 Jun17 ? 00:00:33 /opt/vertica/bin/vertica-udx-zygote 12 10 16076 debug-log-off /home/vertica/test/v_test_node0002_catalog/UDxLogs
    root 23945 23943 0 13:41 ? 00:00:00 bash -c ps -eaf|grep vertica
    root 23953 23945 0 13:41 ? 00:00:00 grep vertica
    vertica 2930 1 0 Jun16 ? 00:00:00 /bin/bash /opt/vertica/agent/agent.sh /opt/vertica/config/users/vertica/agent.conf
    vertica 2937 2930 0 Jun16 ? 00:00:45 /opt/vertica/oss/python/bin/python ./simply_fast.py
    vertica 15953 1 0 Jun17 ? 00:00:29 /opt/vertica/spread/sbin/spread -c /home/vertica/test/v_test_node0003_catalog/spread.conf
    vertica 15955 1 0 Jun17 ? 00:14:59 /opt/vertica/bin/vertica -D /home/vertica/test/v_test_node0003_catalog -C test -n v_test_node0003 -h 10.241.251.134 -p 5433 -P 4803 -Y ipv4
    vertica 15958 15955 0 Jun17 ? 00:00:30 /opt/vertica/bin/vertica-udx-zygote 12 10 15955 debug-log-off /home/vertica/test/v_test_node0003_catalog/UDxLogs
    root 23827 23825 0 13:41 ? 00:00:00 bash -c ps -eaf|grep vertica
    root 23835 23827 0 13:41 ? 00:00:00 grep vertica
    [root@ip-10-241-251-203 ~]#

     

     

  • SruthiA Administrator

    Hi,

     

    You can find out in either of two ways:

     

    1) Go to vsql and issue the query select * from nodes; it will show you the state of all nodes in the Vertica cluster.

     

    2) Open admintools and click on "View Database Cluster State"; it will show you the status of all the nodes in the Vertica cluster.

     

    -Regards,

     Sruthi

  • I am unable to determine the status:

     

    1) When I execute adminTools -> View Status it immediately exits:

     

    adminTools Last Chance Error Handler running...
    raised error: <class 'ConfigParser.NoOptionError'>
    error message: No option 'v_test_node0001' in section: 'Nodes'
    trace file: /opt/vertica/log/adminTools-vertica.errors
    REPORT THIS INFORMATION TO TECHNICAL SUPPORT
    AND INCLUDE CONTENTS OF THE TRACE FILE IN YOUR REPORT

     

    2) when I type vsql I get:

     

    [vertica@ip-10-241-251-220 ~]$ vsql
    vsql: FATAL 4149: Node startup/recovery in progress. Not yet ready to accept connections
    [vertica@ip-10-241-251-220 ~]$

  • SruthiA Administrator

    Hi,

     

      Can you share the admintools.conf file from all 3 nodes? It is present in the directory /opt/vertica/config.

     

    From any node, after you log in, please run the following and share the output:

     

    rpm -qa|grep vertica

     

     

    -Regards,

     Sruthi

  • [vertica@ip-10-241-251-220 ~]$ rpm -qa|grep vertica
    vertica-7.1.1-0.x86_64

     

    As for admintools.conf, it is in 3 locations. Which one should I attach?

     

    /opt/vertica/agent/test/support/config/admintools.conf
    /opt/vertica/agent/test/config/admintools.conf
    /opt/vertica/config/admintools.conf

  • SruthiA Administrator

    Hi,

     

      Could you please share the one from /opt/vertica/config/admintools.conf?

     

     

    -Regards,

     Sruthi

  •  

    [root@ip-10-241-251-220 ~]# cat /opt/vertica/config/admintools.conf
    [Configuration]
    last_port = 5433
    tmp_dir = /tmp
    default_base = /home/dbadmin
    format = 3
    install_opts = -s 'host1,host2,host3' -r './v/vertica-7.1.1-0.x86_64.RHEL5.rpm' -u vertica --failure-threshold NONE
    spreadlog = False
    controlsubnet = default
    controlmode = broadcast

    [Cluster]
    hosts = 10.241.251.220,10.241.251.144,10.241.251.145

    [Nodes]
    node0001 = 10.241.251.220,/home/vertica,/home/vertica
    node0002 = 10.241.251.144,/home/vertica,/home/vertica
    node0003 = 10.241.251.145,/home/vertica,/home/vertica

    [Database:test]
    restartpolicy = ksafe
    port = 5433
    path = /home/vertica/test/v_test_node0001_catalog
    nodes = v_test_node0001,v_test_node0002,v_test_node0003
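
    For what it's worth, the adminTools traceback earlier in the thread ("No option 'v_test_node0001' in section: 'Nodes'") lines up with this config: the [Nodes] section keys are node0001..., while [Database:test] lists v_test_node0001.... A small sketch to surface that mismatch (an inline copy of the posted config keeps it self-contained; on a node you would point CONF at /opt/vertica/config/admintools.conf instead):

```shell
# Sketch: compare the keys under [Nodes] with the node names the database
# expects. adminTools failed looking up 'v_test_node0001' in [Nodes], and
# the posted config defines node0001... there instead.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
[Nodes]
node0001 = 10.241.251.220,/home/vertica,/home/vertica
node0002 = 10.241.251.144,/home/vertica,/home/vertica
node0003 = 10.241.251.145,/home/vertica,/home/vertica

[Database:test]
nodes = v_test_node0001,v_test_node0002,v_test_node0003
EOF
# Keys defined under [Nodes] (print until the next section header):
NODE_KEYS=$(awk '/^\[Nodes\]/{f=1;next} /^\[/{f=0} f && NF {print $1}' "$CONF")
echo "$NODE_KEYS"
# Node names the database expects, one per line:
DB_NODES=$(awk -F' = ' '/^nodes/{print $2}' "$CONF" | tr ',' '\n')
echo "$DB_NODES"
rm -f "$CONF"
```

If the two lists don't match, adminTools cannot resolve the database's nodes, which would explain the NoOptionError seen when viewing cluster state.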

     
