How do I monitor the copy cluster process?

We have scheduled the vbr copycluster task using a cron job. The job occasionally fails for various reasons.
For replicating objects, I know that there is a table, v_monitor.REMOTE_REPLICATION_STATUS, that can be used for monitoring. Is there a similar table for copycluster monitoring?

Answers

  • SruthiA Vertica Employee Administrator

    You can monitor it using vbr.log; the progress percentage is written to the log. As of now, there are no system tables for monitoring copycluster.

  • Thanks SruthiA.
    Is it possible to view the logs via the Management Console? Would it be possible to set up some kind of alert when an error happens?

  • SruthiA Vertica Employee Administrator

    You cannot view them using the Management Console. The logs are in the path you set for tmpDir in the .ini file. You can grep for the word 'error' and create a cron job to set up alerts.
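
The grep-and-cron suggestion above could be sketched in Python roughly as follows. This is only an illustration: the function names are hypothetical, the case-insensitive match on "error" is an assumption about what is worth flagging, and the log path has to be whatever tmpDir in your .ini file points to.

```python
import re
from pathlib import Path

# Assumption: any line containing "error" (in any case) is worth flagging.
ERROR_PATTERN = re.compile(r"error", re.IGNORECASE)


def find_error_lines(log_text):
    """Return (line_number, line) pairs for lines that mention 'error'."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_PATTERN.search(line)
    ]


def check_log(log_path):
    """Print matching lines from the log file and return how many were found."""
    text = Path(log_path).read_text(errors="replace")
    matches = find_error_lines(text)
    for number, line in matches:
        print(f"{log_path}:{number}: {line}")
    return len(matches)
```

A cron job could call `check_log` on the vbr.log path after each copycluster run. Since cron typically mails any output a job produces, printing the matching lines may be enough to act as an alert, provided mail delivery is configured on the host.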

  • bmurrell Community Edition User

    I have a related issue. We run copycluster once a week to 'mop up' any differences not covered by a 'replicate'.
    However, this week the size and duration were 5x larger and I can't see why. Schema sizes before and after are similar and nowhere near the volume reported.
    Any ideas would be appreciated. We're running Vertica v12.
    Thanks

  • Bryan_H Vertica Employee Administrator

    @bmurrell Can you give a little more detail on your environment? Is this EE or Eon mode, and if Eon mode, which storage vendor are you using? In general, copy cluster takes a snapshot of the containers and copies anything new or changed compared to the manifest. Were there large numbers of adds or deletes, especially across different partitions or tables, that might show up in the total size in the STORAGE_CONTAINERS system table?

  • bmurrell Community Edition User

    Hi, it's EE.
    I take a daily sizing extract and can see nothing that grew over that 24 hour period.
    The vbr snapshot was equally large. Is there a way to interrogate the vbr manifest to see which tables were included?

  • Bryan_H Vertica Employee Administrator

    You could compare the manifest files across snapshots to see whether the file count or size has changed much from day to day. For example, in my local backup snapshot there is a manifest file for my single node:
    full_backup_snapshot_20230830_080115/v_d2_node0001/full_backup_snapshot.manifest
    It contains one row per object, where the first field is the sal_storage_id:
    02835cb6555b4a63f48d9eb18dd96dae00a000000003ff6b, bundle, 45035996273704984, 1172, 0
    So you could load these into a temporary table to compare snapshots by sal_storage_id, and also compare them to the current storage_containers records. However, note that there is a manifest for each node in each snapshot, so you'd need to add the node name and snapshot ID to the manifest rows to make them easier to compare.
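
As a lighter-weight alternative to loading the rows into a temporary table, the comparison by sal_storage_id could be sketched in memory with Python. This is an assumption-laden illustration: the function names are hypothetical, the parser simply splits on commas and skips any line that does not look like an object row, and only the first field (the sal_storage_id) is interpreted because the thread does not describe the remaining fields.

```python
def parse_manifest(text):
    """Map sal_storage_id -> tuple of the remaining comma-separated fields.

    Lines without at least two comma-separated fields are skipped, on the
    assumption that they are not object rows.
    """
    entries = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) < 2 or not parts[0]:
            continue
        entries[parts[0]] = tuple(parts[1:])
    return entries


def diff_manifests(old, new):
    """Compare two parsed manifests by sal_storage_id."""
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        # Same ID in both snapshots, but the other fields differ.
        "changed": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }
```

Because there is one manifest per node per snapshot, you would read the two manifest files for the same node from consecutive snapshots, pass their contents to `parse_manifest`, and then call `diff_manifests` on the results. A large "added" list would point to the storage IDs to look up in storage_containers.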
