replication - Approximate bytes to copy
Hi,
I have a process that periodically replicates data from one standalone (single-node) server to another. In theory it should start every 15 minutes.
There are 6 tables to be replicated; their size is 150 GB.
Between two replications there aren't many changes; I would estimate less than 1 GB.
My problem is that vbr reports approximately 35 GB to copy and then replicates that much.
Does anyone know which files are copied?
How can I decrease this "replication size"? Maybe by increasing the number of ROS containers/files associated with these tables?
I thought of summing the size of all files changed after the last replication:
select schema_name,projection_name ,sum(used_bytes)/1024/1024/1024 storage_gb
from v_monitor.storage_containers c, remote_replication_status r
where storage_type='ROS'
and
( (schema_name='s1' and projection_name like 't1_super')
or (schema_name='s2' and projection_name like 't2_super')
)
and c.end_epoch > r.replicated_epoch
group by schema_name, projection_name;
As I said, this sum is 0.1 GB.
So why does the Vertica server copy 35 GB?
Thank you,
Veronica
Comments
It's possible you are loading into inactive partitions and the Tuple Mover (TM) is merging files.
Backup works at the file level: if a file's checksum does not match, rsync has to copy the file over anyway to ensure a consistent snapshot.
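To make the file-level point concrete, here is a minimal Python sketch. It is not vbr's actual algorithm, only an illustration of the idea that a file is copied whenever it is new or its checksum differs. The file names, checksums, and the 35 GB / 115 GB split are made-up assumptions chosen to match the numbers in the question.

```python
from typing import Dict, Tuple

# Each snapshot maps a (hypothetical) storage file name to (checksum, size in bytes).
Snapshot = Dict[str, Tuple[str, int]]

GB = 1024 ** 3


def bytes_to_copy(previous: Snapshot, current: Snapshot) -> int:
    """Sum the sizes of files that are new or whose checksum changed."""
    total = 0
    for name, (checksum, size) in current.items():
        old = previous.get(name)
        if old is None or old[0] != checksum:
            total += size
    return total


# State at the last replication (sizes are assumptions for illustration).
previous: Snapshot = {
    "ros_a": ("c1", 35 * GB),
    "ros_b": ("c2", 115 * GB),
}

# Since then, about 0.1 GB of new rows were loaded and a merge rewrote
# ros_a together with the new rows into a brand-new file, ros_c.
current: Snapshot = {
    "ros_b": ("c2", 115 * GB),
    "ros_c": ("c3", 35 * GB + GB // 10),
}

copied = bytes_to_copy(previous, current)
print(f"{copied / GB:.1f} GB to copy")
```

In this toy model only 0.1 GB of rows changed, but the merged file is entirely new from the copy tool's point of view, so the whole file is transferred. That would be consistent with seeing a small sum of changed data and a much larger replication size.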
I can share some queries I wrote a while ago to find out which files are getting moved and how much space is needed.
You may not be able to use them as-is, but it's a good starting point.
Create a schema
Load some backup file metadata into the tables
Generic space usage
Space common to all snapshots
Thank you, skeswani!
I will check all the statements.
Veronica