Why does "Finalizing object replication" take so long?

veerkumar Vertica Customer

We have a cluster of 3 nodes at the primary site and a cluster of 3 nodes at the disaster recovery (DR) site. Replication to the DR site takes a long time, varying from 1 hour to 2.5 hours, and we want to minimize that time.
Replication is delta replication using the Vertica utility vbr. It is scheduled by a cron job that runs every half hour (at minutes 0 and 30). The vbr log file shows the following.
Log excerpt from vbr_2020-04-09-000001.log:

  • 2020-04-09 00:01:41 localhost progress Progress: 0 out of 68029253339 bytes (0%)
  • 2020-04-09 00:01:44 localhost progress Progress: 228498913 out of 68029253339 bytes (0%)
  • 2020-04-09 00:02:15 localhost progress Progress: 494022162 out of 68029253339 bytes (0%)
  • 2020-04-09 00:02:32 localhost progress Progress: 726786788 out of 68029253339 bytes (1%)
  • 2020-04-09 00:02:35 localhost progress Progress: 964304364 out of 68029253339 bytes (1%)
  • 2020-04-09 00:02:56 localhost progress Progress: 1557458820 out of 68029253339 bytes (2%)
  • 2020-04-09 00:02:58 localhost progress Progress: 1833354886 out of 68029253339 bytes (2%)
  • 2020-04-09 00:03:13 localhost progress Progress: 2062297903 out of 68029253339 bytes (3%)
  • 2020-04-09 00:03:31 localhost progress Progress: 2340905374 out of 68029253339 bytes (3%)
  • 2020-04-09 00:03:49 localhost progress Progress: 2569460483 out of 68029253339 bytes (3%)
  • 2020-04-09 00:04:15 localhost progress Progress: 2796810081 out of 68029253339 bytes (4%)
and so on. The transfer shown above takes 40-60 minutes, which is acceptable.
Then it appears to get stuck: the log says "Finalizing object replication", and this step alone takes 1 to 1.5 hours. Why does it take longer than the actual transfer? Can we speed it up? What does this step actually do?
Regards,
Veer
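To separate the time spent transferring data from the time spent in finalization, it can help to measure the transfer phase directly from the progress lines. The following is a minimal sketch, assuming the vbr log lines have exactly the shape quoted above; the function names `parse_progress` and `transfer_summary` are hypothetical helpers, not part of vbr.

```python
import re
from datetime import datetime

# Matches progress lines of the shape quoted in the log excerpt above.
PROGRESS_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ progress "
    r"Progress: (\d+) out of (\d+) bytes"
)


def parse_progress(lines):
    """Return a list of (timestamp, bytes_done, bytes_total) tuples."""
    samples = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            samples.append((ts, int(match.group(2)), int(match.group(3))))
    return samples


def transfer_summary(samples):
    """Return (elapsed_seconds, bytes_per_second, estimated_total_seconds).

    Uses only the first and last samples, so the rate is an average
    over the observed window, not an instantaneous value.
    """
    if len(samples) < 2:
        raise ValueError("need at least two progress samples")
    (t0, b0, total), (t1, b1, _) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    rate = (b1 - b0) / elapsed if elapsed > 0 else 0.0
    estimate = total / rate if rate > 0 else float("inf")
    return elapsed, rate, estimate
```

Applied to the first and last lines of the excerpt, this gives an average of roughly 18 MB/s, which projects to about an hour for the full 68 GB and is consistent with the 40-60 minutes reported for the transfer. Whatever time remains between the last progress line and the end of the run would then be attributable to finalization.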

Answers

  • jheffner Employee

    What version of Vertica are you running?
    Often the most expensive part of finalization is for the server to verify that any newly replicated storage has been copied, which can involve many stat calls on the local filesystem if a lot of new storage has been replicated. However, this should generally take significantly less time than the actual data replication.
    Another place to investigate is the vertica.log on the node that vbr connects to. You should see a transaction that runs the metafunction load_snapshot. Check whether anything out of the ordinary is reported there.
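    As a starting point for that search, here is a minimal sketch that pulls out every line mentioning load_snapshot along with its line number. It assumes only that the relevant entries contain the literal string `load_snapshot`; the vertica.log layout is not described in this thread, and `find_mentions` is a hypothetical helper name.

```python
def find_mentions(lines, needle="load_snapshot"):
    """Yield (line_number, text) for every line that contains needle."""
    for lineno, line in enumerate(lines, start=1):
        if needle in line:
            yield lineno, line.rstrip("\n")


def scan_file(path, needle="load_snapshot"):
    """Print matching lines from the file at path (path is caller-supplied)."""
    # errors="replace" avoids failing on stray non-UTF-8 bytes in a log file.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for lineno, text in find_mentions(handle, needle):
            print(f"{lineno}: {text}")
```

    Calling `scan_file` with the path to vertica.log on the node vbr connects to would list the candidate lines; comparing the timestamps around the first and last match may show where the finalization time goes.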

  • veerkumar Vertica Customer

    Thanks, jheffner, for the response.
    Vertica version: Vertica Analytic Database v8.1.1-27
    I am looking into vertica.log. I think vbr connects to all six nodes (3 primary, 3 disaster recovery), but I am not sure about that.
    Let me see if anything out of the ordinary is reported there; I will get back to you with my findings.
    Thanks and regards.
