Options

Vertica 7.2.2 - Copycluster got empty tables on destination

stestazstestaz Community Edition User

Hi All,
I'm running vbr with copycluster option in order to clone my production cluter to staging cluster, each cluster has got 1 node and the vertica version is the same.
The vbr process runs correctly and no error are promped but on destination some tables are empty, I've tried to drop the empy tables on destination but the result is the same, I've tried to delete the destination db too but with no luck, this is my ini file

[Misc]
snapshotName = dwh_snapshot
dest_verticaBinDir = /opt/vertica/bin
restorePointLimit = 3
objectRestoreMode = createOrReplace

[Database]
dbName = dwh
dbUser = dbadmin
dbPromptForPassword = True

[Transmission]
port_rsync = 50000
[Mapping]
v_dwh_node0001 = xxx.xxx.xxx.245:

Any suggestions?

Answers

  • Options
    Bryan_HBryan_H Vertica Employee Administrator

    Trying running with option "--debug 3" and check logs for any additional messages that might show why some tables are not copied. Also check the temp dir (default /tmp/vbr) to see if there are additional log files there. You may need to set temp dir explicitly in the [Misc] section to ensure proper permissions and enough free space to copy objects:
    [Misc]
    tempDir=/path/to/vbrTemp

  • Options
    stestazstestaz Community Edition User

    @Bryan_H said:
    Trying running with option "--debug 3" and check logs for any additional messages that might show why some tables are not copied. Also check the temp dir (default /tmp/vbr) to see if there are additional log files there. You may need to set temp dir explicitly in the [Misc] section to ensure proper permissions and enough free space to copy objects:
    [Misc]
    tempDir=/path/to/vbrTemp

    I've tried as you suggested but with no luck, there are no errors in the log file and the table is still empty

  • Options
    Jim_KnicelyJim_Knicely - Select Field - Administrator

    @stestaz - Any info in the Vertica log on the desitnation node?

  • Options
    stestazstestaz Community Edition User

    This is what I found on the vertica.log
    2020-01-03 11:21:18.678 Init Session:0x7f1cac011dd0-a00000000120a5 [Txn] Begin Txn: a00000000120a5 'Moveout: (Table: dmtrt.org_d_uo) (Projection: dmtrt.org_d_uo_v1_super)'
    2020-01-03 11:21:18.679 Init Session:0x7f1cac011dd0-a00000000120a5 [EE] (a00000000120a5) TM Moveout: WOS is empty at epoch 16007933; nothing to do
    2020-01-03 11:21:18.679 Init Session:0x7f1cac011dd0-a00000000120a5 [Txn] Starting Commit: Txn: a00000000120a5 'Moveout: (Table: dmtrt.org_d_uo) (Projection: dmtrt.org_d_uo_v1_super)'

  • Options
    stestazstestaz Community Edition User

    @Jim_Knicely said:
    @stestaz - Any info in the Vertica log on the desitnation node?

    This is what I found on the vertica.log
    2020-01-03 11:21:18.678 Init Session:0x7f1cac011dd0-a00000000120a5 [Txn] Begin Txn: a00000000120a5 'Moveout: (Table: dmtrt.org_d_uo) (Projection: dmtrt.org_d_uo_v1_super)'
    2020-01-03 11:21:18.679 Init Session:0x7f1cac011dd0-a00000000120a5 [EE] (a00000000120a5) TM Moveout: WOS is empty at epoch 16007933; nothing to do
    2020-01-03 11:21:18.679 Init Session:0x7f1cac011dd0-a00000000120a5 [Txn] Starting Commit: Txn: a00000000120a5 'Moveout: (Table: dmtrt.org_d_uo) (Projection: dmtrt.org_d_uo_v1_super)'

  • Options
    Bryan_HBryan_H Vertica Employee Administrator

    Is "Table: dmtrt.org_d_uo" one of the tables that was not copied? If so, were there any other messages related? It might help to grep the name of a table that is not copied to determine whether there were any messages at all related to the table or its projections.
    Also, it would help to see the vbr logs for the copy at debug level 3, even if they appeared successful, in case there were messages there that might help us understand what is happening.

  • Options
    stestazstestaz Community Edition User

    @Bryan_H said:
    Is "Table: dmtrt.org_d_uo" one of the tables that was not copied? If so, were there any other messages related? It might help to grep the name of a table that is not copied to determine whether there were any messages at all related to the table or its projections.
    Also, it would help to see the vbr logs for the copy at debug level 3, even if they appeared successful, in case there were messages there that might help us understand what is happening.

    Hi Bryan, yes dmtrt.org_d_uo is one of the tables that was not copied.
    In the Source machine I've a log file from vbr that I attached but I cannot find nothing interesting but I'm not so experienced.
    there are some logs in folder /tmp/vbr_rsyncd can tehy be usefull?

  • Options
    Bryan_HBryan_H Vertica Employee Administrator

    I see nothing unexpected in the vbr log. The log files in /tmp/vbr_rsyncd may be useful since they may show an issue copying specific files.
    How many tables are missing data? I wonder if there is some other issue I'm not thinking of.

  • Options
    stestazstestaz Community Edition User

    @Bryan_H said:
    I see nothing unexpected in the vbr log. The log files in /tmp/vbr_rsyncd may be useful since they may show an issue copying specific files.
    How many tables are missing data? I wonder if there is some other issue I'm not thinking of.

    I've tried to compare tables by dimension whit this query

    select schema_name as table_schema,
    anchor_table_name as table_name,
    round(sum(used_bytes)/(1024^2), 2) as used_mb
    from v_monitor.storage_containers sc
    join v_catalog.projections p
    on sc.projection_id = p.projection_id
    group by table_schema,
    table_name
    order by used_mb desc;
    and there is a difference in nearly all tables, it look likes if the restore is a partial restore or if it is not a full restore, i'm going crazy...

  • Options
    stestazstestaz Community Edition User

    I've found another strange thing in the source host, the epoch if I use vbr --task listbackup is stuck at 15626842 can the 2 things be connected?

  • Options
    Bryan_HBryan_H Vertica Employee Administrator

    That's a bit of a concern, and would explain why there is missing data: only data up to that epoch is backed up.
    Let's ccheck whether epoch is actually advancing in Vertica. The following query will show Last Good Epoch and Ancient History Mark:
    SELECT /+label(LGEAHMlagging)/ current_epoch,Case WHEN (last_good_epoch = -1 ) THEN current_epoch -1 ELSE last_good_epoch END lge_epoch,ahm_epoch,
    Case WHEN (last_good_epoch = -1 ) THEN 1 ELSE (current_epoch - last_good_epoch) END lge_lag, current_epoch - ahm_epoch as ahm_lag
    FROM system;

  • Options
    stestazstestaz Community Edition User

    @Bryan_H said:
    That's a bit of a concern, and would explain why there is missing data: only data up to that epoch is backed up.
    Let's ccheck whether epoch is actually advancing in Vertica. The following query will show Last Good Epoch and Ancient History Mark:
    SELECT /+label(LGEAHMlagging)/ current_epoch,Case WHEN (last_good_epoch = -1 ) THEN current_epoch -1 ELSE last_good_epoch END lge_epoch,ahm_epoch,
    Case WHEN (last_good_epoch = -1 ) THEN 1 ELSE (current_epoch - last_good_epoch) END lge_lag, current_epoch - ahm_epoch as ahm_lag
    FROM system;

    Hi Bryan, unlikely my cluster had got a problem and I had to restore from the old backup (loosing all data after the backup epoch), now I think the epoch is not stuck at all and this is the result of the query

    I think now the problem with sincronization and backup is solved but many data has been lost

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file