Node down in 18 node cluster

Joseph Vertica Customer
edited March 2021 in General Discussion

One of the nodes in our 18-node cluster is down due to some OS issues. We have identified the problem,
but until it is fixed we are running on 17 nodes with K-safety 1.
I have some follow-up questions:
1) Will there be data loss if the buddy node also goes down?
2) Does a rebalance make sense for 17 nodes without removing the down node, as our high-load ETL jobs are getting stuck?

Answers

  • mosheg Vertica Employee Administrator

    Q - Will there be data loss if the buddy node also goes down?
    A - Committed data will not be lost.
    While node 18 is down, if its buddy node also goes down, the database is considered unsafe and shuts down automatically.

    Q - Does a rebalance make sense for 17 nodes without removing the down node, as our high-load ETL jobs are getting stuck?
    A - No. It is advised to create a standby node instead.
    See: https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/AdministratorsGuide/ManageNodes/CreatingAnActiveStandbyNode.htm
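
To make the "unsafe" condition above concrete, here is a minimal Python sketch. It is a toy model, not Vertica's actual implementation: it assumes each data segment lives on one node plus the next node in a ring (its buddy), and that more than half of the nodes must stay up. The function name `is_database_safe` and the 0-based node numbering are hypothetical, chosen only for illustration.

```python
def is_database_safe(total_nodes, down_nodes, k=1):
    """Return True if a quorum is up and every segment still has a live copy.

    Toy model only: nodes are numbered 0..total_nodes-1, so "node 18"
    in an 18-node cluster corresponds to index 17 here.
    """
    down = set(down_nodes)

    # Assumed quorum rule: more than half of the nodes must be up.
    if 2 * (total_nodes - len(down)) <= total_nodes:
        return False

    # Assumed ring layout: segment i is stored on node i and on the
    # next k nodes in the ring (its buddies).
    for segment in range(total_nodes):
        holders = {(segment + offset) % total_nodes for offset in range(k + 1)}
        if holders <= down:  # every copy of this segment is on a down node
            return False
    return True


if __name__ == "__main__":
    print(is_database_safe(18, [17]))      # only node 18 down
    print(is_database_safe(18, [17, 0]))   # node 18 and its assumed buddy down
    print(is_database_safe(18, [17, 5]))   # node 18 and a non-adjacent node down
```

Under these assumptions, losing one node, or two nodes that do not hold copies of the same segment, leaves the database up, while losing a node together with its buddy does not.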

  • Joseph Vertica Customer

    Thanks @mosheg for clarifying the questions above.
    I have some more questions, please help.
    1) The node was down due to OS issues. We have now managed to bring it back.
    But we observed that while the node was down, high-load ETL jobs and other reporting were stuck on the other 17 nodes.
    Is there any way, while one node is down and until it is restored, for the other 17 nodes to function optimally so that our business is not impacted?

    2) Also, is it possible to remove a down node?

    3) We have K-safety 1 and a 20 TB license. We use almost 18 TB of the licensed capacity.
    I suppose increasing K-safety will not be possible if we want to stay within license compliance?

  • Joseph Vertica Customer

    One more question:
    If in the future we suspect an OS problem on a node and want to remove it from the cluster and rebalance the data before it goes down,
    will this process require downtime for all loads, ETL jobs, reporting, and end users of the database?

  • Joseph Vertica Customer
    edited March 2021

    Thanks @mosheg, your suggestions and the suggested links are going to make our life easier.
    I am definitely going to open a support case for further evaluation.
    Just two last questions:
    a) When you say to check how our data is distributed, do you mean checking segmentation, rebalance, and data skew? (We use Vertica 9.1, which I think is good enough.)
    b) As per Sumeet's link, we need to have a standby node ready in a separate single-node cluster to handle this situation.
    However, as per the docs, we can also achieve the same by keeping one standby node in our 18-node cluster, right?
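
As an illustration of what a data skew check might look like, here is a small Python sketch that compares per-node row counts against the cluster mean. The node names and counts are made-up sample values, and `skew_report` is a hypothetical helper. In Vertica, per-node row counts could presumably be taken from a system table such as `v_monitor.projection_storage`, but that step is an assumption and is not shown here.

```python
def skew_report(rows_per_node):
    """Return (mean, {node: percent deviation from the mean}).

    Assumes a non-empty mapping with a non-zero total row count.
    """
    mean = sum(rows_per_node.values()) / len(rows_per_node)
    deviations = {
        node: round(100.0 * (rows - mean) / mean, 1)
        for node, rows in rows_per_node.items()
    }
    return mean, deviations


if __name__ == "__main__":
    # Hypothetical row counts for one projection on a four-node cluster.
    sample = {"node01": 1000, "node02": 1000, "node03": 1300, "node04": 700}
    mean, deviations = skew_report(sample)
    for node, pct in sorted(deviations.items()):
        print(f"{node}: {pct:+.1f}% vs. mean of {mean:.0f} rows")
```

A node that sits well above the mean holds more than its share of the data and is likely to be the slowest participant in queries and loads.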

  • Joseph Vertica Customer

    Thanks @mosheg
