replication - Approximate bytes to copy
Hi,
I have a process that periodically replicates data from one standalone (1 node) server to another. In theory it should start every 15 minutes.
There are 6 tables to be replicated; their size is 150 GB.
Between two replications there aren't many changes. I would estimate less than 1 GB.
My problem is that vbr estimates about 35 GB to copy and then replicates that much.
Does anyone know which files are copied?
How can I decrease this 'replication size'? Maybe by increasing the number of ROS containers/files associated with these tables?
I thought of summing the size of all files changed after the last replication:
select schema_name, projection_name, sum(used_bytes)/1024/1024/1024 storage_gb
from v_monitor.storage_containers c, remote_replication_status r
where storage_type = 'ROS'
  and (   (schema_name = 's1' and projection_name like 't1_super')
       or (schema_name = 's2' and projection_name like 't2_super') )
  and c.end_epoch > r.replicated_epoch
group by schema_name, projection_name;
As I said, this sum is 0.1 GB.
So why does the Vertica server copy 35 GB?
Thank you,
Veronica
Comments
It's possible that you are loading into inactive partitions and the Tuple Mover (TM) is merging files. A merge writes new, larger files, so a small load can still change many bytes on disk.
Backup occurs at the file level: if the checksum of a file does not match, rsync has to copy the whole file over anyway to ensure a consistent snapshot.
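To illustrate the point with a toy example (made-up table names, file names and sizes; this is not vbr's actual logic, just a sketch of file-level comparison): suppose 0.05 GB of new data is loaded and the TM then merges it with an existing 17.9 GB container into a new file.

```sql
-- Hypothetical manifest tables: one row per storage file in each snapshot.
create table prev_files (file_name varchar(64), size_gb numeric(10,2));
create table curr_files (file_name varchar(64), size_gb numeric(10,2));

-- Previous snapshot: two large containers and one small one.
insert into prev_files values ('ros_a', 17.00);
insert into prev_files values ('ros_b', 17.90);
insert into prev_files values ('ros_c', 0.05);

-- Since then 0.05 GB was loaded and merged with ros_b and ros_c
-- into a brand-new file, ros_e. ros_a is untouched.
insert into curr_files values ('ros_a', 17.00);
insert into curr_files values ('ros_e', 18.00);

-- Files that exist now but not in the previous snapshot must be copied whole.
select sum(c.size_gb) as gb_to_copy
from curr_files c
left join prev_files p on c.file_name = p.file_name
where p.file_name is null;
```

Only ros_e is unmatched, so the query reports 18 GB to copy even though just 0.05 GB of data is actually new. That is the kind of gap you are seeing between 0.1 GB and 35 GB.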
I can share some queries I wrote a while ago to find out which files are getting moved and how much space is needed.
You may not be able to use them as-is, but it's a good starting point.
Create a schema
Load some backup file metadata into the tables
Generic space usage
Space common to all snapshots
Thank you, skeswani!
I will check all the statements.
Veronica