Starting database at AWS stuck in INITIALIZING

Hi,

I created a 10-node Vertica cluster on AWS (7.1.1-10) and ran CopyCluster from the production environment to AWS.
42TB … finished after 10 days.

Now that CopyCluster has finished, I'm trying to start the database from admintools:
*** Starting database: vertica_polonium_mtc ***
Starting nodes:
v_vertica_polonium_mtc_node0001
v_vertica_polonium_mtc_node0002
v_vertica_polonium_mtc_node0003
v_vertica_polonium_mtc_node0004
v_vertica_polonium_mtc_node0005
v_vertica_polonium_mtc_node0006
v_vertica_polonium_mtc_node0007
v_vertica_polonium_mtc_node0008
v_vertica_polonium_mtc_node0009
v_vertica_polonium_mtc_node0010

Starting Vertica on all nodes. Please wait, databases with large catalogs may take a while to initialize.

Node Status: v_vertica_polonium_mtc_node0001: (INITIALIZING) v_vertica_polonium_mtc_node0002: (INITIALIZING)
v_vertica_polonium_mtc_node0003: (INITIALIZING) v_vertica_polonium_mtc_node0004: (INITIALIZING)
v_vertica_polonium_mtc_node0005: (INITIALIZING) v_vertica_polonium_mtc_node0006: (INITIALIZING)
v_vertica_polonium_mtc_node0007: (INITIALIZING) v_vertica_polonium_mtc_node0008: (INITIALIZING)
v_vertica_polonium_mtc_node0009: (INITIALIZING) v_vertica_polonium_mtc_node0010: (INITIALIZING)
...
It is suggested that you continue waiting.
Do you want to continue waiting? (yes/no) [yes] no
Database startup successful, but it may be incomplete. Some nodes
remain in a transitional state. Waiting can sometimes resolve this
problem. See Database Cluster State in Main Menu for updated
cluster state. If this state persists, try using the Advanced menu
to Stop Vertica on Host, then Restart Database.
Press RETURN to continue


The nodes stayed in INITIALIZING for several hours and never came up.
I didn't find anything in vertica.log, adminTools-dbadmin.log (attached), or any other log ... everything seems OK :(
I checked admintools.conf (correct and identical on all nodes), confirmed that spread is running on all nodes (while the start database command is running), and restarted the cluster ... and again, everything seems OK. So what is the problem? Why don't the nodes come UP?
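
For anyone repeating this check across many nodes, the "Node Status" lines that admintools prints can be summarized with a few lines of Python. This is only a minimal sketch: `node_states` is a hypothetical helper name, and the regular expression assumes the `name: (STATE)` layout shown in the output above.

```python
import re
from collections import Counter

# Matches pairs such as "v_vertica_polonium_mtc_node0001: (INITIALIZING)".
# Assumption: the status text uses the "name: (STATE)" layout quoted in this post.
STATUS_RE = re.compile(r"(\S+):\s*\((\w+)\)")


def node_states(output):
    """Return {node_name: state} parsed from admintools status text."""
    return dict(STATUS_RE.findall(output))


sample = """\
Node Status: v_vertica_polonium_mtc_node0001: (INITIALIZING) v_vertica_polonium_mtc_node0002: (INITIALIZING)
v_vertica_polonium_mtc_node0003: (INITIALIZING) v_vertica_polonium_mtc_node0004: (INITIALIZING)
"""

states = node_states(sample)
summary = Counter(states.values())
# Every node that has not reached the UP state yet.
not_up = sorted(name for name, state in states.items() if state != "UP")

print(summary)
print(not_up)
```

Pasting the full admintools output into `sample` gives a quick count per state, which helps confirm whether all nodes are stuck in the same state or only some of them.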

Can anyone help?

Comments

  • Hi,

    Check your dbLog file; it often contains useful information about database startup.

    Thanks

  • Checked... nothing :(

    dbLog:

    Conf_load_conf_file: using file: /db/catalog/vertica_polonium_mtc/v_vertica_polonium_mtc_node0001_catalog/spread.conf
    Setting active IP version to 0
    Successfully configured Segment 0 [10.99.87.255]:4803 with 10 procs:
    N010099084071: 10.99.85.71
    N010099084182: 10.99.85.182
    N010099084184: 10.99.85.184
    N010099084212: 10.99.85.212
    N010099085111: 10.99.86.111
    N010099085231: 10.99.86.231
    N010099086170: 10.99.85.170
    N010099086234: 10.99.84.234
    N010099086244: 10.99.85.244
    N010099087103: 10.99.84.103
    Connecting to spread at 4803
    Connected to spread on local domain socket 4803
    Starting UDxSideProcess for language C++
    with command line: /opt/vertica/bin/vertica-udx-C++ 3 ip-10-99-85-111-22634:0x2 debug-log-off /db/catalog/vertica_polonium_mtc/v_vertica_polonium_mtc_node0001_catalog/UDxLogs
    03/07/16 17:54:07 SP_connect: DEBUG: Auth list is: NULL
    03/07/16 17:54:07 SP_connect: connected with private group(21 bytes): #node_a#N010099085111; mbox=9, pid=22634
    03/07/16 17:56:09 SP_disconnect: mbox=9, pid=22634, send_group=#node_a#N010099085111
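
One detail in the dbLog above is easy to miss: the node connects to spread at 17:54:07 and disconnects at 17:56:09. The sketch below measures that gap in Python. `spread_events` is a hypothetical helper, and the month/day/year reading of the timestamp is an assumption (the gap is the same either way, since both events fall on the same day). Whether the disconnect reflects a timeout is not confirmed anywhere in this thread.

```python
import re
from datetime import datetime

# Assumption: timestamps look like "03/07/16 17:54:07" followed by an SP_* event name.
TS_RE = re.compile(r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (SP_\w+):")


def spread_events(log_text):
    """Return a list of (timestamp, event_name) for SP_* lines in a dbLog."""
    return [
        (datetime.strptime(ts, "%m/%d/%y %H:%M:%S"), name)
        for ts, name in TS_RE.findall(log_text)
    ]


sample = """\
03/07/16 17:54:07 SP_connect: DEBUG: Auth list is: NULL
03/07/16 17:54:07 SP_connect: connected with private group(21 bytes): #node_a#N010099085111; mbox=9, pid=22634
03/07/16 17:56:09 SP_disconnect: mbox=9, pid=22634, send_group=#node_a#N010099085111
"""

events = spread_events(sample)
connect = max(t for t, name in events if name == "SP_connect")
disconnect = min(t for t, name in events if name == "SP_disconnect" and t >= connect)
gap = (disconnect - connect).total_seconds()

print(gap)  # 122.0
```

A connection that is dropped about two minutes after it was opened, with no error logged, is a hint that the node is waiting on the other cluster members and then giving up.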

  • Fixed :)

    The source cluster's catalog was configured for broadcast, while the target cluster (AWS) had been created with the pt2pt (point-to-point) parameter. The CopyCluster command overwrote the target's configuration and switched it to broadcast.

    With help from HP support, we edited the catalog of the target cluster to set it back to pt2pt.
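
In hindsight, the dbLog posted earlier in the thread already hints at this: Segment 0 is configured at 10.99.87.255, which is the broadcast address of the subnet that contains all ten node IPs. That fits the diagnosis, since AWS VPC networking does not deliver broadcast traffic. The sketch below is a rough heuristic in Python, not a Vertica tool. `parse_dblog` and `looks_like_broadcast` are hypothetical helper names, and it assumes that a point-to-point setup would not list a subnet broadcast address as the segment address.

```python
import ipaddress
import re

SEGMENT_RE = re.compile(r"Successfully configured Segment \d+ \[([\d.]+)\]:(\d+)")
NODE_RE = re.compile(r"^\s*(N\d+):\s*([\d.]+)\s*$")


def parse_dblog(text):
    """Return (segment_ip, {spread_name: node_ip}) from dbLog text."""
    segment_ip, nodes = None, {}
    for line in text.splitlines():
        seg = SEGMENT_RE.search(line)
        if seg:
            segment_ip = ipaddress.ip_address(seg.group(1))
            continue
        node = NODE_RE.match(line)
        if node:
            nodes[node.group(1)] = ipaddress.ip_address(node.group(2))
    return segment_ip, nodes


def looks_like_broadcast(segment_ip, node_ips):
    """Heuristic: is segment_ip the broadcast address of the smallest
    network that contains it together with every node IP?"""
    node_ips = list(node_ips)
    if not node_ips or segment_ip in node_ips:
        return False
    for prefix in range(32, -1, -1):
        net = ipaddress.ip_network(f"{segment_ip}/{prefix}", strict=False)
        if all(ip in net for ip in node_ips):
            return segment_ip == net.broadcast_address
    return False


sample = """\
    Successfully configured Segment 0 [10.99.87.255]:4803 with 10 procs:
    N010099084071: 10.99.85.71
    N010099087103: 10.99.84.103
    N010099085231: 10.99.86.231
"""

segment_ip, nodes = parse_dblog(sample)
is_broadcast = looks_like_broadcast(segment_ip, nodes.values())
print(segment_ip, is_broadcast)
```

If the check returns True on a cluster running in AWS, the spread configuration in the catalog is worth comparing against the mode the cluster was installed with before spending hours waiting on INITIALIZING.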
