How Insert / Updates work in vertica in detail

I need some clarity on how any insert / update gets written to WOS and propagated to copies ? 1. Is WOS is single per cluster or per node. ( I assume it is per node as per shared nothing architecture ) 2. If I insert row with id 1 on node 1, it gets written to WOS on node 1. Do i get commit acknowledgement right away ? How and when its replicated to other nodes. By the time it gets replicated ( but committed to node1 ), do queries on those replica return obsolete data ? 3. What if my row with id 1 on node 1 committed successfully on WOS but yet not moved to ROS and not yet replicated to other replicas. Meanwhile what if node 1 goes down ?

Comments

  • Hi Amit, thanks for reaching out! I have passed your question along to our support team.
  • 1. Is WOS is single per cluster or per node. ( I assume it is per node as per shared nothing architecture ) A: There is a WOS on each node. 2. If I insert row with id 1 on node 1, it gets written to WOS on node 1. Do i get commit acknowledgement right away ? How and when its replicated to other nodes. By the time it gets replicated ( but committed to node1 ), do queries on those replica return obsolete data ? A: When you insert data into a projection, is parsed, segmented (if needed) and placed into WOS (or ROS in case of DIRECT; load). it is placed into WOS on every node in the cluster that contains the segment you are inserting to, including replicas, at the same time. At that point the data is accessible, and all nodes have a consistent copy of the data, so there will be the same results regardless of the data. 3. What if my row with id 1 on node 1 committed successfully on WOS but yet not moved to ROS and not yet replicated to other replicas. Meanwhile what if node 1 goes down ? A: As per #2, there is no point where the data is committed successfully on one node but not on the other. As a note, default is that MoveOut (data is moved from WOS to ROS) every 5 minutes or when WOS reaches a certain fill level. So if you lost all nodes simultaneously, there would be data loss of whatever was in WOS, but it would be limited to the data loaded since the last MoveOut.
  • Hi,
    I have a question along the same lines. If I have 3 tables A B and C
    1.  I do a big insert on A(without direct hint )  and commit it. Lets assume its a big dml so the move-out process is taking a while
    2. Meanwhile , I insert into table B some rows that I select from table A(assuming table A rows are now available for querying since I committed it, even though they might not all be in ROS yet)
    I use a DIRECT hint this time and the commit it. 
    3. I create a third table C, and just insert some unrelated rows in it using DIRECT hint. commit it.

    All this while DML  for A is still running.
    My questions:
    IF I lose all my nodes now, when I bring my DB back up:
    1. What will be the state of A,B and C.
    2. Will C have all data that I inserted into it.
    3. Will B have all the data that I inserted, including the ones that were committed for A, but the move-out didnt move it to ROS.
    4. where will be the last good epoch.

    Thanks for the explanation!
    Dgh

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file