backup multi-node cluster management

Pietro_La_Torre · December 2013

Hi,
I have a multi-node cluster with 3 nodes and I am planning a backup procedure.
What are the best practices for cluster snapshots?
Is it better to store all the snapshot data on one node, or to keep each node's slice of the database locally on that node?

If I want to be able to copy the snapshots to external disks, must I store all the data on a single node?

What are the pros and cons of keeping a local copy on each node versus a single copy on one node?

Thanks in advance,
Pietro.

Raul_1 · October 2014

The documentation makes the following recommendation:
Note: If possible, for optimal network performance when creating a backup, HP Vertica recommends having each node in the cluster use a dedicated backup host.
http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/AdministratorsGuide/BackupRestore/ConfiguringBackupHosts.htm?Highlight=backup

Pietro_La_Torre · April 2015

Hi, I have a question about backup size and configuration for a 3-node cluster.

In my backup script I would like to write a snapshot of a 3-node cluster to a shared NFS mount. My .ini file looks like this:
...
[Mapping]
v_vmart_node0001 = host1:/home/dbadmin/backups
v_vmart_node0002 = host2:/home/dbadmin/backups
v_vmart_node0003 = host3:/home/dbadmin/backups

Here /home/dbadmin/backups is on an NFS disk shared between host1, host2 and host3 (so it would be the same if I used localhost:/home/dbadmin/backups).
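
To make the layout concrete, here is a minimal Python sketch that reads a [Mapping] section like the one above and groups the nodes by target directory. The function names `parse_mapping` and `nodes_by_path` are hypothetical helpers, not part of any Vertica tool, and the sketch assumes every value has the form host:/path as in the quoted lines.

```python
import configparser
from collections import defaultdict

# The [Mapping] lines quoted in the post; the rest of the .ini is omitted.
INI_TEXT = """
[Mapping]
v_vmart_node0001 = host1:/home/dbadmin/backups
v_vmart_node0002 = host2:/home/dbadmin/backups
v_vmart_node0003 = host3:/home/dbadmin/backups
"""


def parse_mapping(ini_text):
    """Return {node_name: (backup_host, backup_dir)} from a [Mapping] section."""
    # Split keys from values on "=" only, so the ":" in "host:/path" is untouched.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.read_string(ini_text)
    mapping = {}
    for node, target in parser["Mapping"].items():
        host, _, path = target.partition(":")
        mapping[node] = (host, path)
    return mapping


def nodes_by_path(mapping):
    """Group node names by backup directory, ignoring the host part."""
    groups = defaultdict(list)
    for node, (_host, path) in mapping.items():
        groups[path].append(node)
    return dict(groups)


if __name__ == "__main__":
    for path, nodes in nodes_by_path(parse_mapping(INI_TEXT)).items():
        print(path, "<-", ", ".join(nodes))
```

With the mapping from the post, all three nodes fall into one group, which matches the description of a single shared directory holding one subdirectory per node.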

Now, when I look at what is written to the backupDir, I don't understand how the data is laid out.
My database data dir is 100 GB and the backupDir has 3 subdirectories (one per node, which is correct) of 100 GB each. So it looks as if every node makes its own independent snapshot.
I don't use hard links, and I measure disk usage with the Linux command "du -sh *" in the backupDir, which doesn't follow symlinks.

I would expect a full snapshot of all nodes to total more than 100 GB and less than 300 GB, depending on how the projections are segmented and replicated. If every projection were unsegmented and replicated on all nodes, the total backup size would be 300 GB (but this is not the case).
Am I wrong to list every node in the mapping? (Is it enough to map just one node?)

Besides this doubt about the total size, I also don't understand whether the backups are really incremental:
- every node's backupDir has one or more *_archive dirs, depending on restorePointLimit (2 in my case)
- when I run subsequent snapshots with the same .ini, the size of each new snapshot is not just that of the new data: the snapshots are always at least as big as the older ones (as if all the data had changed), and I can't figure out why. If archive no. 1 is 100 GB, newer snapshots are >= 100 GB; I would expect them to be 1-10 GB if 1-10% of the data had changed.
Can you give me any help?
Pietro
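
One way to check how much new data each archive really adds is to count every file once by inode, so that a file shared between archives through a hard link is not counted again. The following is a minimal Python sketch of that idea. It assumes the archives might share files via hard links, which the thread does not confirm. The helper name `unique_usage` is hypothetical, and the sketch sums apparent file sizes (`st_size`) rather than allocated disk blocks, so its figures can differ slightly from what `du` reports.

```python
import os
import sys


def unique_usage(directories):
    """Return {directory: bytes it adds}, counting each inode only once.

    Directories are processed in the order given, so a file that is
    hard-linked into an earlier directory adds nothing to a later one.
    """
    seen = set()  # (device, inode) pairs already counted
    usage = {}
    for directory in directories:
        total = 0
        # os.walk does not descend into symlinked directories by default.
        for root, _dirs, files in os.walk(directory):
            for name in files:
                st = os.lstat(os.path.join(root, name))  # do not follow symlinks
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue
                seen.add(key)
                total += st.st_size
        usage[directory] = total
    return usage


if __name__ == "__main__":
    # Example: python unique_usage.py <oldest_archive_dir> <newer_archive_dir> ...
    for directory, size in unique_usage(sys.argv[1:]).items():
        print(f"{size / 1e9:8.2f} GB  {directory}")
```

Passing the archive directories from oldest to newest shows how much each later archive adds on top of the earlier ones. If a later archive still reports roughly the full database size, its files are separate copies rather than links to the earlier archive.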
