We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


replication - Approximate bytes to copy — Vertica Forum

replication - Approximate bytes to copy

Hi,
I have a process that replicates data periodically from one standalone (1 node) server to other. In theory it should start every 15 min.
There are 6 tables to be replicated. their size being 150 GB.
Between 2 replication there aren't many changes. I would appreciate, less than 1 GB.
My problem is that vbr approximates and replicates 35 GB.

Does someone know what files are copied ?
How to decrease these 'replication size' ? Maybe, increasing the number of ROS/files associated to these tables ?

I though of summing size of all files changed after the last replication:

select schema_name,projection_name ,sum(used_bytes)/1024/1024/1024 storage_gb
from v_monitor.storage_containers c, remote_replication_status r
where storage_type='ROS'
and
( (schema_name='s1' and projection_name like 't1_super')
or (schema_name='s2' and projection_name like 't2_super')
)
and c.end_epoch > r.replicated_epoch
group by schema_name, projection_name;

As I said, this sum is 0,1 Gb.
So, why vertica server copies 35 GB ?

Thank you,
Veronica

Tagged:

Comments

  • skeswaniskeswani - Select Field - Employee
    edited August 2019

    Its possible you are loading into inactive partitions and the TM is merging files.
    Backup occurs at a file level and if the checksum of the files does not match, rsync has to copy them over anyways to ensure a consistent snapshot.

    I can share some queries i wrote a while ago to find out what files are getting moved and how much space is needed.

    you may not be able to use them as-is, buts its a good starting point

  • skeswaniskeswani - Select Field - Employee

    Create a schema

    -- actual disk usage
    create table files (sz int, file varchar(200), path varchar(2000), dt timestamp, storage_id varchar(200) default substr(file,0,49) ) order by storage_id segmented by hash (storage_id ) all nodes ksafe;

    -- snapshot_info
    create table snapshot_info ( manifest_file varchar(200), storage_id varchar(200), type varchar(50) , storage_location_id int, length int, digest varchar(50)) order by storage_id segmented by hash(storage_id) all nodes ksafe;

    -- backup_info
    create table backup_info ( manifest_file varchar(200), storage_id varchar(200), type varchar(50) , ref_count int, length int, digest varchar(50)) order by storage_id segmented by hash(storage_id) all nodes ksafe;

    Load some backup file metadata into the tables

    find /location -maxdepth 10 -printf "%s|%f|%p|%TY-%Tm-%Td %TT\n" | vsql -c "copy files (sz,file, path,dt) from stdin abort on error;"

    for file in $(find /location -name *.manifest); do cat $file | sed -ne "/[objects]/,$ p;" | grep -v "[objects]" | sed 's/, /|/g' | sed "s/^/${file//\//_}|/g" ; done | vsql -c "copy snapshot_info from stdin abort on error;"

    for file in $(find /location -name backup_manifest); do cat $file | sed -ne "/[Objects]/,$ p;" | grep -v "[Objects]" | sed 's/, /|/g' | sed "s/^/${file//\//_}|/g" ; done | vsql -c "copy backup_info from stdin abort on error;";

    Generic space usage

    -- space for backup
    skeswani=> select sum(length)//10^12 TB from backup_info ;

    TB

    28
    (1 row)

    -- space on disk (mostly accurate, exclused some dfs dirs)
    skeswani=> select sum(sz)//10^12 TB from files where substr( path, position (files.storage_id in path), length(path)) = file;

    TB

    28
    (1 row)

    space common to all snapshots

    -- space common to all snapshots
    skeswani=> select sum(length)//10^12 common_size_TB from backup_info where >ref_count > 1;

    common_size_TB

    19
    (1 row)
    -- space sitting in only one snapshot.
    skeswani=> select sum(length)//10^12 common_size_TB from backup_info where >ref_count = 1;

    common_size_TB

    9
    (1 row)

  • Thank you , skeswani !
    I will check all stmt-s.
    Veronica

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file