Unable to start db — Vertica Forum

mpompo Vertica Customer

Hi,
I just set up a three-node test cluster (Vertica 10 Community Edition) on Linux VMs. I used CentOS images managed by Oracle VirtualBox. The machines can communicate with each other, and apparently everything is OK at the OS level.
The Vertica installation went fine, but I cannot start the freshly created database.

Using adminTools I have:
*** Starting database: kaka ***
Starting nodes:
v_kaka_node0001 (192.168.1.201)
v_kaka_node0002 (192.168.1.202)
v_kaka_node0003 (192.168.1.203)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_kaka_node0001: (DOWN) v_kaka_node0002: (DOWN) v_kaka_node0003: (DOWN)
...then I wait a few minutes and get:
Nodes in TRANSITIONAL state: 192.168.1.201, 192.168.1.203, 192.168.1.202
Nodes DOWN: v_kaka_node0001, v_kaka_node0002, v_kaka_node0003 (may be still initializing).
Server startup was successful on some nodes, but not complete

In adminTools.log I see a suspicious message:
2020-07-15 14:05:16.465 at_exec/6637:0x7ffb6e408740 [ATRunner.exec_module] running: module=vertica.engine.api.db_client.module version=1.0 args={"description": "get cluster status", "cluster_status": true, "database": "kaka", "port": 5433}
2020-07-15 14:05:16.487 at_exec/6637:0x7ffb6e408740 [ATRunner.exec_module] result: status=Failure host=None content={"description": "get cluster status", "failure_reason": "ConnectionError: Failed to establish a connection to the primary server or any backup address.", "failure_operation": "connect-secure", "failure_details": {"stack": "Traceback (most recent call last):\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica/engine/api/db_client/executor.py\", line 133, in _run_cluster_status\n conn = conn_helper.make_connection()\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica/engine/api/db_client/connection_helper.py\", line 149, in make_connection\n conn = self.try_secure()\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica/engine/api/db_client/connection_helper.py\", line 180, in try_secure\n return self._finalize_conn(self.conn_method(**args))\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica/engine/api/db_client/cluster_status.py\", line 66, in cluster_status_connect\n return ClusterStatusConnection(kwargs) # type: ignore\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica_python/vertica/connection.py\", line 280, in __init__\n self.startup_connection()\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica_python/vertica/connection.py\", line 609, in startup_connection\n self.write(messages.Startup(user, database, session_label, os_user_name))\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica/engine/api/db_client/cluster_status.py\", line 72, in write\n return super().write(message)\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica_python/vertica/connection.py\", line 499, in write\n sock = self._socket()\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica_python/vertica/connection.py\", line 361, in _socket\n raw_socket = self.establish_connection()\n File \"/opt/vertica/oss/python3/lib/python3.7/site-packages/vertica_python/vertica/connection.py\", line 480, in establish_connection\n raise errors.ConnectionError(err_msg)\nvertica_python.errors.ConnectionError: Failed to establish a connection to the primary server or any backup address.\n", "type": "", "message": "Failed to establish a connection to the primary server or any backup address."}, "runner_ack": true} error_message=None

dbLog looks ok:
Conf_load_conf_file: using file: /home/dbadmin/kaka/v_kaka_node0001_catalog/spread.conf
Conf_load_conf_file: vertica version is 7
Setting active IP version to 2
Configured daemon 'N192168001201' with IP '192.168.1.201'
Auto-generated virtual ID = '3372329152' for daemon 'N192168001201'
Daemon 'N192168001201' will have virtual ID = '3372329152'
Configured daemon 'N192168001202' with IP '192.168.1.202'
Auto-generated virtual ID = '3389106368' for daemon 'N192168001202'
Daemon 'N192168001202' will have virtual ID = '3389106368'
Configured daemon 'N192168001203' with IP '192.168.1.203'
Auto-generated virtual ID = '3405883584' for daemon 'N192168001203'
Daemon 'N192168001203' will have virtual ID = '3405883584'
Successfully configured Segment 0 [192.168.1.255]:4803 with 3 procs:
N192168001201: 192.168.1.201
N192168001202: 192.168.1.202
N192168001203: 192.168.1.203
Connected to spread on local domain socket /opt/vertica/spread/tmp/4803
auto restart closing socket

startup.log ends with:
{
"node" : "v_kaka_node0001",
"stage" : "Waiting for Cluster Invite",
"text" : "Prepare to be invited",
"timestamp" : "2020-07-15 13:54:54.001"
}
{
"node" : "v_kaka_node0001",
"stage" : "Waiting for Cluster Invite",
"text" : "Ready to be invited",
"timestamp" : "2020-07-15 13:54:54.002"
}
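
For anyone comparing startup.log across several nodes, the last reported stage can be pulled out with a few lines of Python. This is only a sketch: it assumes the file is a sequence of JSON objects written back to back, as in the excerpt above, and the function names (`iter_json_objects`, `last_stage_per_node`) are made up for illustration, not part of any Vertica tool.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON object from text that holds several back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def last_stage_per_node(text):
    """Map each node name to the most recent startup entry seen for it."""
    latest = {}
    for entry in iter_json_objects(text):
        latest[entry["node"]] = entry
    return latest


# Sample copied from the startup.log excerpt in this post.
SAMPLE = """
{
"node" : "v_kaka_node0001",
"stage" : "Waiting for Cluster Invite",
"text" : "Prepare to be invited",
"timestamp" : "2020-07-15 13:54:54.001"
}
{
"node" : "v_kaka_node0001",
"stage" : "Waiting for Cluster Invite",
"text" : "Ready to be invited",
"timestamp" : "2020-07-15 13:54:54.002"
}
"""

for node, entry in last_stage_per_node(SAMPLE).items():
    print(node, "|", entry["stage"], "|", entry["text"], "|", entry["timestamp"])
```

If every node turns out to be stuck at "Waiting for Cluster Invite", that suggests the nodes are up individually but not hearing from each other, which points toward the network rather than the database itself.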

vertica.log ends with normal messages:
2020-07-15 14:25:48.000 Timer Service:0x7fb2f27fc700 [Util] Task 'FeatureUseLogger' enabled
2020-07-15 14:25:48.000 Timer Service:0x7fb2f27fc700 [Util] Task 'LicenseSizeAuditor' enabled
2020-07-15 14:25:48.000 Cluster Inviter:0x7fb2d9ffb700 [Comms] My global sequence value is 136008

I checked netstat and it looks fine:
[dbadmin@cent1 v_kaka_node0001_catalog]$ netstat -utln|grep 5433
udp 0 0 192.168.1.201:5433 0.0.0.0:*

Where should I check for the problem?

Answers

  • LenoyJ Employee
    edited July 2020

    Other than ports and the firewall, since this is a virtualized environment, check whether you enabled point-to-point communication when you created your cluster.

  • mpompo Vertica Customer

    @LenoyJ said:
    Other than ports and the firewall, since this is a virtualized environment, check whether you enabled point-to-point communication when you created your cluster.

    Hi,
    I installed it using "install_vertica --hosts --rpm " only.
    Since all hosts are in the same subnet, I ignored this option.
    Should I reinstall with the '-T' option, or can I reconfigure it?
    I have no firewalls. I will install nmap to test UDP traffic.
    Regards

  • LenoyJ Employee
    edited August 2020

    For --point-to-point, the docs say:

    Also use this option for all virtual environment installations, whether the virtual servers are on the same subnet or not.

    You could try reconfiguring using update_vertica and setting the parameters -s, -r, -T and -S. But I would simply reinstall.

  • mpompo Vertica Customer

    Hi,
    I did it :)
    Indeed, it was necessary to switch off the default firewalld:
    systemctl mask firewalld
    systemctl disable firewalld
    systemctl stop firewalld

    Before:
    [root@cent2 centos]# nmap -sU -p 5433 192.168.1.202
    Starting Nmap 7.70 ( https://nmap.org ) at 2020-07-17 11:34 CEST
    Nmap scan report for cent2 (192.168.1.202)
    Host is up (0.000027s latency).

    PORT STATE SERVICE
    5433/udp closed pyrrho


    After:
    [root@cent2 centos]# nmap -sU -p 5433 192.168.1.202
    Starting Nmap 7.70 ( https://nmap.org ) at 2020-07-17 11:39 CEST
    Nmap scan report for cent3 (192.168.1.203)
    Host is up (0.00032s latency).

    PORT STATE SERVICE
    5433/udp open|filtered pyrrho

    It was not necessary to reinstall with the "point-to-point" option.
    Thank you for your help.
    Regards
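
The nmap scans above probe UDP only, while the adminTools error comes from a failed client connection to port 5433. A TCP check from each node to every other node can complement them. The following is a minimal sketch using only the Python standard library. It assumes the client port 5433 (from the adminTools log) and the spread port 4803 (from dbLog) should both accept TCP connections once the database processes are running. The helper names `tcp_port_open` and `check_cluster` are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_cluster(hosts, ports, timeout=2.0):
    """Return {(host, port): reachable} for every host/port combination."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host in hosts
        for port in ports
    }


# Example call with the addresses from this thread (run it on each node in turn):
#
#   results = check_cluster(
#       ["192.168.1.201", "192.168.1.202", "192.168.1.203"],
#       [5433, 4803],  # assumed TCP ports: client port and spread port
#   )
#   for (host, port), ok in sorted(results.items()):
#       print(host, port, "open" if ok else "closed or filtered")
```

A port reported as closed or filtered from another node, but open when checked locally, is a hint that a host firewall such as firewalld sits in between. The check says nothing about UDP, so it does not replace the nmap -sU test.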
