vbr configuration for incremental backup

mpo Vertica Customer

Hi,
I'm trying to configure a backup strategy like this: one monthly full backup plus daily incrementals.
I set restorePointLimit = 30 in the backup ini file, but every day I see full backups being created.
The documentation is not very helpful on this subject.
https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AdministratorsGuide/BackupRestore/RepeatingBackups.htm?tocpath=Administrator's Guide|Backing Up and Restoring the Database|Creating Backups|_____5
I'm using Vertica 9.3, if that makes a difference.
What is wrong with my setup?

Answers

  • LenoyJ Employee
    edited July 2020

    The restorePointLimit parameter is used for point-in-time backup/restore, which is probably why full backups are being created. That is, if you set it to 30 and run the backup once a day, you can return your database to how it was on any day in the last 30 days. For example, if you want to return the database to how it was on 07/01/2020, you can pass the --archive parameter to the restore task with the appropriate date (in this case 20200701_xyz). Use the listbackup task to list all backups and find the correct backup name.
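
    As a rough sketch of that restore step (not a tested procedure): the helper below derives the archive ID from a backup name, assuming the name always ends in _<date>_<time>. The function names, the config file name backup.ini and the backup name mybackup_20200701_220004 are made up for illustration.

```shell
# Sketch only: function names, backup.ini and the backup name are hypothetical.

# Print the trailing <date>_<time> part of a backup name,
# e.g. mybackup_20200701_220004 -> 20200701_220004.
# Assumes the name ends in _<date>_<time>.
# The ( ... ) body runs in a subshell, so the variables stay local.
archive_id() (
    name=$1
    time_part=${name##*_}    # text after the last underscore
    rest=${name%_*}          # name without the time part
    date_part=${rest##*_}    # text after the last remaining underscore
    printf '%s_%s\n' "$date_part" "$time_part"
)

# Print (rather than run) the restore command for a given backup name.
restore_command() (
    config=$1
    backup_name=$2
    printf 'vbr -t restore -c %s --archive=%s\n' \
        "$config" "$(archive_id "$backup_name")"
)

restore_command backup.ini mybackup_20200701_220004
```

    This only prints the command (here: vbr -t restore -c backup.ini --archive=20200701_220004), so you can check it against your own listbackup output before running anything.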
     
    For your situation, and if you want the point in time recovery to be once a month, I would recommend creating two config files:

    • Let the first config file handle your daily backups. So set restorePointLimit to be the default (1) and run this once a day. The backups will be incremental (as long as you use the same config file). If something goes wrong today you can restore the last day's backup.
    • Let the second config file handle your monthly full backups. Set restorePointLimit to be 12 and run it once a month. After 12 months, you'll have 12 point-in-time full backups (one for each month you ran it).

    If you don't care for point-in-time backups, just use the first config file. There may be other ways to do this but that's my 2 cents. :smile:
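
    If it helps, here is a rough sketch of how a single once-a-day job could drive the two config files described above. The file names daily.ini and monthly.ini and the function name are hypothetical; only the vbr -t backup -c <file> form is taken as given.

```shell
# Sketch only: daily.ini and monthly.ini are hypothetical file names.

# Print the config files to back up with on a given day of the month:
# the daily config every day, plus the monthly config on day 1.
# The ( ... ) body runs in a subshell, so the variable stays local.
configs_for_day() (
    day=${1#0}               # strip a leading zero, e.g. "01" -> "1"
    echo "daily.ini"
    if [ "$day" -eq 1 ]; then
        echo "monthly.ini"
    fi
)

# Intended use from a once-a-day job (not executed here):
#   for cfg in $(configs_for_day "$(date +%d)"); do
#       vbr -t backup -c "$cfg"
#   done
```

    Keeping the day-of-month decision in one small function means it can be checked without touching the database.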

  • mpo Vertica Customer

    Hi,
    Thank you for the answer.
    I'm very new to Vertica, so maybe I don't understand the concepts.
    I'm trying to get a setup where I have a full backup (created on the first day of the month as a baseline) and daily incrementals (to save disk space). I also need to be able to restore the database not only to the first day of the month, but to any day on which an incremental was created.
    The next month I need a new full backup and new incrementals until the end of the month. Previous backups can be discarded.
    Is it possible?

    Now, with restorePointLimit=1, I have just two FULL backups (not incremental ones, as you wrote).
    With restorePointLimit=30 I also get FULL backups.

    -bash-4.1$ vbr -t listbackup -c backup_test.ini
    backup                        backup_type   epoch
    Backup_Incr_20200713_220004   full          80553850
    Backup_Incr_20200712_220003   full          80386579
    Backup_Incr_20200711_220004   full          80359377
    Backup_Incr_20200710_220004   full          80331626
    ...and so on.

  • LenoyJ Employee
    edited July 2020

    The definition of a full backup is as listed on the docs: https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AdministratorsGuide/BackupRestore/TypesOfBackups.htm
     
    Pay attention to this line:

    When a full backup already exists, vbr backs up new or changed data since the last full backup occurred, rather than making another complete copy. You can specify the number of historical backups to keep.

    listbackup says it's a "full" backup because it is a full backup by Vertica's definition. But that does not mean that running it twice backs up all files twice at the file system level. Full backups using the same config file are always incremental and only copy the deltas. Let's take an example. I have a database whose approximate data size on node 1 is shown below:

    dbadmin=> SELECT (SUM(used_bytes)/1024/1024/1024)::integer as used_bytes_gb FROM storage_containers WHERE node_name='v_lenoy_ent_3n1_node0001';
     used_bytes_gb
    ---------------
                 2
    (1 row)
    

    Next, I created a config file with restorePointLimit set to 2.

    [Mapping]
    v_lenoy_ent_3n1_node0001 = []:/home/dbadmin/backups
    v_lenoy_ent_3n1_node0002 = []:/home/dbadmin/backups
    v_lenoy_ent_3n1_node0003 = []:/home/dbadmin/backups
    
    [Misc]
    snapshotName = incremental_snapshot
    restorePointLimit = 2
    
    [Database]
    dbName = lenoy_ent_3n1
    

    I ran the backup. And listbackup now shows something like the following:

    $ vbr -t listbackup -c forum_backup.ini
    backup                                 backup_type   epoch
    incremental_snapshot_20200714_160858   full          69
    

    Let's check the backup directory size:

    $ du -sh /home/dbadmin/backups
    2.5G    /home/dbadmin/backups
    

    It has backed up 2.5 GB (data + catalog). Let me run the backup again with the same config file, without adding any new data.

    $ vbr -t backup -c forum_backup.ini
    Starting backup of database lenoy_ent_3n1.
    Participating nodes: v_lenoy_ent_3n1_node0001, v_lenoy_ent_3n1_node0002, v_lenoy_ent_3n1_node0003.
    Snapshotting database.
    Snapshot complete.
    Approximate bytes to copy: 0 of 7896955165 total.
    Copying backup metadata.
    Finalizing backup.
    [==================================================] 100%
    Backup complete!
    

    Notice that it copied 0 bytes, even though it is a full backup with a restorePointLimit of 2. Let's check the size at the directory level again.

    $ du -sh /home/dbadmin/backups
    2.5G    /home/dbadmin/backups
    

    It's still the same size. And listbackup now shows two full backups (restore points); you can restore to either of them:

    $ vbr -t listbackup -c forum_backup.ini
    backup                                 backup_type   epoch
    incremental_snapshot_20200714_161329   full          69
    incremental_snapshot_20200714_160858   full          69
    

    Hope that helps to understand that full backups (as per Vertica definition) are always incremental and you can restore back to any of the backups you took.
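
    If you want to check this on your own runs, a small helper like the one below can pull the numbers out of the "Approximate bytes to copy" line. This is only a sketch: it assumes the line is worded exactly as in the output above, and the function name is made up.

```shell
# Sketch only: matches the "Approximate bytes to copy" line as it is worded
# in the vbr output above; other versions may word it differently.

# Read vbr backup output on stdin and print "<bytes_to_copy> <total_bytes>".
bytes_to_copy() {
    sed -n 's/^ *Approximate bytes to copy: \([0-9][0-9]*\) of \([0-9][0-9]*\) total.*$/\1 \2/p'
}

# Example with the line from the run above:
echo 'Approximate bytes to copy: 0 of 7896955165 total.' | bytes_to_copy
```

    For the run above this prints 0 7896955165. A first number far below the second means the run only copied deltas, whatever listbackup calls the backup type.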
     
    Now, you said:

    @mpo said:
    I'm trying to get a setup where I have a full backup (created on the first day of the month as a baseline) and daily incrementals (to save disk space). I also need to be able to restore the database not only to the first day of the month, but to any day on which an incremental was created.
    The next month I need a new full backup and new incrementals until the end of the month. Previous backups can be discarded.

    In this case, setting restorePointLimit to 30 is appropriate. If you run it once a day, you will be able to restore to any one day in the last 30 days. After the 31st run, the oldest backup will be removed.
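
    To keep an eye on how many restore points you currently have, you could count the rows that listbackup prints. A minimal sketch, assuming the output has the same header and layout as the listings in this thread (the function name is made up):

```shell
# Sketch only: assumes listbackup output with a "backup backup_type epoch"
# header followed by one row per backup, as in the listings in this thread.

# Read `vbr -t listbackup` output on stdin and print the number of backups,
# skipping the header row and any blank lines.
count_restore_points() {
    awk '
        $1 == "backup" && $2 == "backup_type" { next }   # header row
        NF > 0 { n++ }                                   # one backup per row
        END { print n + 0 }
    '
}

# Intended use (not executed here):
#   vbr -t listbackup -c backup_test.ini | count_restore_points
```

    Comparing that count from day to day shows when the oldest restore point starts being dropped.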

  • mpo Vertica Customer

    Hi,
    This is a comprehensive answer!
    I was misled by the word 'full' when I ran vbr -t listbackup. I expected something like 'incr'.
    Just out of curiosity: in which cases does vbr report a different backup_type?

  • LenoyJ Employee

    The ones that show up are full backups, object-level backups and hard-link local backups. vbr in Enterprise mode also supports replicating objects and copying an entire cluster to another one, but these won't be listed in listbackup as they aren't backups that you can restore back to...

  • Girish_Nanjappa Vertica Customer

    LenoyJ: Thanks for the good explanation. However, I'm curious about how Vertica handles the restore point limit. I mean, if I set my restorePointLimit parameter to 7 and take a backup, the first one will be a "full_backup" and the following ones will be incremental. But on the 8th day, when it has to remove the oldest backup, would it remove the "full_backup"? If yes, how can the remaining backup be a complete backup?

  • LenoyJ Employee
    edited April 2021

    @Girish_Nanjappa, good question. When you run a backup, vbr creates a "manifest" which contains a list of all the files vbr needs for restoring to that restore point. When the time comes to remove the oldest backup, vbr will only remove the data files that no other restore point is referencing. That way your second oldest restore point will be your complete backup.

    You can go into your backup directory and look inside these manifest files for yourself. There will be a "backup manifest", which contains all files referenced by all restore points. And there will be "snapshot manifests", one for each restore point, each containing all the files needed for that particular restore point.

    Backup manifest:

    $ pwd
    /home/dbadmin/backups
    $ ls *manifest*
    backup_manifest
    
    

    One of my Snapshot manifests:

    $ pwd
    /home/dbadmin/backups/Snapshots/incremental_snapshot_20200714_160858/v_lenoy_ent_3n1_node0001
    $ ls *manifest*
    incremental_snapshot.manifest
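
    The removal rule described above (only delete data files that no other restore point references) can be illustrated with a toy example. This is not how vbr is implemented: the sketch treats each manifest as a plain list of file names, one per line, which is an assumption for illustration and not the real manifest format. The function name is made up too.

```shell
# Sketch only: treats each manifest as a plain list of file names, one per
# line. That format is an assumption for illustration, not vbr's real one.

# Print the files listed in the oldest manifest ($1) that are not listed in
# any of the remaining manifests ($2 ...). Only these would be safe to delete.
unreferenced_files() {
    awk '
        FILENAME == ARGV[1] { oldest[$0] = 1; next }   # files of the oldest restore point
        { delete oldest[$0] }                          # still referenced by a newer one
        END { for (f in oldest) print f }
    ' "$@" | sort
}

# Intended use (not executed here), with three hypothetical list files:
#   unreferenced_files day1.list day2.list day3.list
```

    For example, if day1.list contains file_a, file_b and file_c, day2.list contains file_b, file_c and file_d, and day3.list contains file_c, file_d and file_e, the function prints only file_a. After that file is removed, day 2 still has every file it lists, which is why the second oldest restore point remains a complete backup.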
    
    
  • gvishal1331 Vertica Customer
    Can anybody please share Vertica DB backup and restore scripts?


    Thanks in advance.
  • gvishal1331 Vertica Customer
    Thanks @vertica_Curtis
    Can you also please help me schedule the backup and restore process in crontab? Please share the steps.
    I'm new to Vertica DB; I've just started learning.
  • gvishal1331 Vertica Customer
    Hello,
    Please guide me on how I can set up Vertica DB schema replication.
    I have a 3-node server at the DC location and a 3-node server at the DR location.
    I want to replicate my vdu user from DC to DR and from DR to DC.
    ----
    1. Both locations can ping and reach each other.
    2. Both locations have passwordless SSH configured.
    3. Both locations have the same DB name and user name, and the same node names with different IP addresses.


    Thanks in advance.
