Shortening the catalog sync interval?

What do you think of shortening the catalog sync interval to 1 minute or even less?
Especially when only a few DDLs are executed during the day.

Question asked at the 2020 Big Data Conference presentation "Sizing and Configuring Vertica in Eon Mode for Different Use Cases", given by @skeswani and @skamat.

Answers

  • skeswani - Employee

    This is not really necessary.
    1. The chance that a node goes down is low.
    2. Even if it did, there is a secondary node which has the catalog changes. The secondary will persist these changes to S3.
    3. If both primary and secondary go down (very, very rare) and the catalog was on an EBS volume, it will be synced on the next restart (assuming delete-on-termination is disabled for the catalog volume).

    The disadvantage of syncing every 1 minute is that you write the entire Tx log to S3 every minute, and the Tx log can get large (about 1/2 the size of the catalog on disk). That creates a lot of constant network and I/O load as insurance against a very rare event (two nodes being terminated, including termination of their volumes), all to protect against a minimal loss (about 5 minutes).
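
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It takes the answer's two statements at face value: each sync writes the entire Tx log, and the Tx log is about half the size of the catalog on disk. The 2 GB catalog size and the function name `daily_sync_upload_gb` are made up for illustration, and treating 5 minutes as the current interval is an assumption based on the loss figure quoted above. Real traffic will depend on your catalog and workload.

```python
MINUTES_PER_DAY = 24 * 60


def daily_sync_upload_gb(catalog_size_gb: float, interval_min: float) -> float:
    """Estimate GB written to S3 per day by catalog sync.

    Assumes, per the answer above, that every sync uploads the whole Tx log
    and that the Tx log is about half the size of the on-disk catalog.
    """
    if interval_min <= 0:
        raise ValueError("interval_min must be positive")
    tx_log_gb = catalog_size_gb / 2          # rule of thumb from the answer
    syncs_per_day = MINUTES_PER_DAY / interval_min
    return tx_log_gb * syncs_per_day


if __name__ == "__main__":
    catalog_gb = 2.0  # hypothetical catalog size, for illustration only
    for interval in (5, 1):
        upload = daily_sync_upload_gb(catalog_gb, interval)
        print(f"every {interval} min: ~{upload:.0f} GB/day written to S3")
```

Under these assumptions, going from a 5-minute to a 1-minute interval multiplies the daily upload volume by five, regardless of how few DDLs actually ran.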
