Test and Dev Node is not starting

veerkumar Vertica Customer
edited March 2021 in General Discussion

We have a single test and dev node. The server was recently rebooted, and the database has not started since.
When we checked the logs we saw the LGE and tried to restore from it, but without success.
tail Epoch.log
Last good epoch: 0x1274a8f ended at '2000-01-01 05:00:00+05'
Last good catalog version: 0x19f0a39
K-safety: 0
AHM: 0x19b80d5 ended at '2021-01-10 01:50:03.249346+05'

We tried:
$admintools -t restart_db -d <> -p <> --epoch 19352207
It throws the following error:
Invalid value for last good epoch: '19352207'
Epoch number must be between 26968277 and 19352207 inclusive

We also tried a forced restart:
$admintools -t start_db -d <> -p <> --force
No luck there either. startup.log shows the error: ASR required
We have seen this error before on a three-node cluster and recovered from it with make_ahm_now(true) while two nodes were up and one was down.
Here, however, the cluster has only one node, since it is test and dev.
So how do we use make_ahm_now(true) on a one-node cluster (test and dev)?
Or how else can we bring this node up?
We don't care about restoring the data because it is a test and dev environment.
Any help would be appreciated.

Best Answer

  • Nimmi_gupta Employee
    Answer ✓

    Is it a one-node cluster?
    The DB is not coming up because AHM > LGE. Some projection is holding the LGE at epoch 19352207, which is less than the AHM epoch 26968277.
    Last good epoch: 0x1274a8f = 19352207
    AHM: 0x19b80d5 = 26968277
    Follow the steps below to bring the DB up.
    [1] First, start the DB in unsafe mode:
    admintools -t start_db -d -U
    [2] Run the query below to list the projections that are behind the AHM:
    SELECT e.node_name, t.table_schema, t.table_name, e.projection_schema, e.projection_name, checkpoint_epoch
    FROM projection_checkpoint_epochs e, projections p, tables t
    WHERE e.projection_id = p.projection_id and p.anchor_table_id = t.table_id
    and not (is_temp_table) and is_behind_ahm and e.is_up_to_date;
    [3] Once you have found the projections whose epoch is behind the AHM, abort the recovery by running the query below:
    SELECT do_tm_task('abortrecovery', '');
    [4] Shut down the database that was started in unsafe mode.
    [5] Start the DB in normal mode.
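
The hex-to-decimal conversion used in the diagnosis above can be double-checked with a few lines of Python. This is only a rough sketch: the sample text is copied from the Epoch.log output quoted in the question, and the `parse_epochs` helper is a hypothetical name made up for illustration, not part of any Vertica tooling.

```python
import re

# Sample lines copied from the Epoch.log output quoted in the question.
EPOCH_LOG = """\
Last good epoch: 0x1274a8f ended at '2000-01-01 05:00:00+05'
Last good catalog version: 0x19f0a39
K-safety: 0
AHM: 0x19b80d5 ended at '2021-01-10 01:50:03.249346+05'
"""


def parse_epochs(text):
    """Hypothetical helper: return (lge, ahm) as decimal integers."""
    lge = re.search(r"^Last good epoch:\s*(0x[0-9a-fA-F]+)", text, re.MULTILINE)
    ahm = re.search(r"^AHM:\s*(0x[0-9a-fA-F]+)", text, re.MULTILINE)
    if lge is None or ahm is None:
        raise ValueError("could not find both the LGE and AHM lines")
    # Epoch.log prints epochs in hex; admintools reports them in decimal.
    return int(lge.group(1), 16), int(ahm.group(1), 16)


lge, ahm = parse_epochs(EPOCH_LOG)
print(f"LGE = {lge}, AHM = {ahm}")
if ahm > lge:
    # The condition the answer identifies as blocking startup.
    print("AHM is ahead of LGE")
```

With the values from this thread, AHM (26968277) is larger than LGE (19352207), which matches the reversed range printed in the admintools error message in the question.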

Answers

  • veerkumar Vertica Customer

    Thank you for the response.
    The suggested steps worked. We found four projections with an LGE less than the AHM. We aborted the recovery and the DB then restarted smoothly.
    Thank you
