Cannot recover yet, some non-current nodes have LGE behind
Hi,
startup.log is showing the error below. Because of the urgency, the cluster was restored from the Last Good Epoch (LGE).
a) In what scenario does this message appear, and what are the recommended troubleshooting steps?
b) Was restoring from the LGE the right thing to do, or was there a better solution?
tail -f startup.log
"stage" : "Plan Recovery",
"text" : "Cannot recover yet, some non-current nodes: node0002: LGE is behind: 214191296 instead of 214199608: ",
"timestamp" : "2020-11-09 05:03:58.447"
}
{ "node" : "node0002", "stage" : "Plan Recovery", "text" : "Cannot recover yet, some non-current nodes: node0002: LGE is behind: 214191296 instead of 214199608: ", "timestamp" : "2020-11-09 05:04:03.448" }
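For anyone triaging a long startup.log, the "LGE is behind" fragment can be pulled out programmatically to see which node is lagging and by how many epochs. The following is only a minimal sketch: the regular expression is inferred from the single message shown above, and the function name parse_lge_behind is made up for illustration, so other Vertica versions may format the text differently.

```python
import re

# Pattern inferred from the message text in this thread (an assumption,
# not an official log format specification).
LGE_BEHIND = re.compile(
    r"(?P<node>\w+): LGE is behind: (?P<actual>\d+) instead of (?P<expected>\d+)"
)

def parse_lge_behind(text):
    """Return (node, actual_lge, expected_lge, gap) for each match in text."""
    results = []
    for match in LGE_BEHIND.finditer(text):
        actual = int(match.group("actual"))
        expected = int(match.group("expected"))
        results.append((match.group("node"), actual, expected, expected - actual))
    return results

# The "text" field copied from the log entry above.
sample = (
    "Cannot recover yet, some non-current nodes: node0002: "
    "LGE is behind: 214191296 instead of 214199608: "
)

for node, actual, expected, gap in parse_lge_behind(sample):
    print(f"{node}: LGE {actual}, expected {expected}, behind by {gap} epochs")
```

For the entry in this thread, the gap works out to 214199608 - 214191296 = 8312 epochs on node0002.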
Best Answers
Nimmi_gupta - Employee
@karthik
The message "LGE is behind: 214191296 instead of 214199608" indicates that some projections are holding epoch 214191296 while others are holding epoch 214199608.
[1]
Run the query below to identify the impacted projections:
SELECT e.node_name, t.table_schema, t.table_name, e.projection_schema, e.projection_name, checkpoint_epoch
FROM projection_checkpoint_epochs e, projections p, tables t
WHERE e.projection_id = p.projection_id
  AND p.anchor_table_id = t.table_id
  AND is_behind_ahm;
[2]
Once you have found the projections and decided to bring the database up from the higher epoch, you need to run abortrecovery on those projections.
The query below generates the abort commands:
select 'select do_tm_task('||'''abortrecovery'''||','||''''||projection_schema||'.'|| projection_name ||''');' as command
from (select distinct projection_schema, projection_name
      from v_catalog.projection_checkpoint_epochs
      where checkpoint_epoch = ) as v1;
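To make it easier to see what the generator query produces, here is a rough Python sketch that builds the same kind of do_tm_task('abortrecovery', ...) statements from a list of (schema, projection) pairs. The function name abortrecovery_commands and the projection names in the example are hypothetical; in practice the pairs would come from the projection_checkpoint_epochs query, and the generated statements should be reviewed before being run.

```python
def abortrecovery_commands(projections):
    """Build one abortrecovery statement per distinct (schema, projection) pair.

    Mirrors the SELECT DISTINCT in the generator query: duplicates are
    dropped and the first-seen order is kept.
    """
    seen = set()
    commands = []
    for schema, name in projections:
        key = (schema, name)
        if key in seen:
            continue
        seen.add(key)
        # Double any single quote so the value stays a valid SQL string literal.
        target = f"{schema}.{name}".replace("'", "''")
        commands.append(f"select do_tm_task('abortrecovery','{target}');")
    return commands

# Hypothetical projection names, for illustration only.
example = [
    ("public", "sales_b0"),
    ("public", "sales_b1"),
    ("public", "sales_b0"),  # duplicate row, e.g. reported by another node
]

for command in abortrecovery_commands(example):
    print(command)
```

Printing the commands first, rather than executing them directly, leaves room to confirm that only the intended projections are listed.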
Answers
@karthik: Please open a support case, since we need to review the logs to perform the RCA.
Thanks, Sruthi. Will submit one.