Data Collector not collecting for 1 of 4 nodes

bmurrell Community Edition User

Running v10.1.

The following SQL shows that node 3 is not collecting:
SELECT node_name, first_time, last_time, memory_buffer_size_kb, disk_size_kb
FROM v_monitor.data_collector
WHERE table_name = 'dc_cpu_aggregate_by_minute'
ORDER BY table_name;

Node             First Time           Last Time            Mem (KB)  Disk (KB)
v_****_node0004  2022-02-28 16:34:00  2022-03-07 15:04:00  64        25600
v_****_node0003  2022-02-22 21:08:00  2022-02-28 14:49:00  64        25600
v_****_node0002  2022-03-02 03:21:00  2022-03-07 15:04:00  64        25600
v_****_node0001  2022-03-02 02:08:00  2022-03-07 15:04:00  64        25600

I've tried to reset using:
SELECT set_data_collector_policy('CpuAggregateByMinute',64,25600);

and changing to different values, and then back, but nothing helps.
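
For anyone who wants to automate this check, here is a minimal Python sketch that flags nodes whose last_time trails the newest node. It is not part of the original post. The rows are copied from the query output above, with the database name still masked; the function name `stalled_nodes` and the 5-minute threshold are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Rows copied from the query output above: (node_name, last_time).
ROWS = [
    ("v_****_node0004", "2022-03-07 15:04:00"),
    ("v_****_node0003", "2022-02-28 14:49:00"),
    ("v_****_node0002", "2022-03-07 15:04:00"),
    ("v_****_node0001", "2022-03-07 15:04:00"),
]


def stalled_nodes(rows, max_lag=timedelta(minutes=5)):
    """Return (node_name, lag) for nodes trailing the newest last_time by more than max_lag.

    max_lag is an arbitrary threshold chosen for this sketch.
    """
    parsed = [(node, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")) for node, ts in rows]
    newest = max(ts for _, ts in parsed)
    return [(node, newest - ts) for node, ts in parsed if newest - ts > max_lag]


if __name__ == "__main__":
    for node, lag in stalled_nodes(ROWS):
        print(f"{node} is behind by {lag}")
```

With the rows above, only node 3 is reported, since the other three nodes share the same last_time.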

Answers

  • SergeB - Employee

    What is the current status of your 4 nodes?

  • bmurrell Community Edition User

    @SergeB said:
    What is the current status of your 4 nodes?

    All up and running. Just noticed that our collection of DataCollector tables into a central source wasn't working for 1 node on this cluster.

  • SergeB - Employee

    Is your issue with your collection process? "Just noticed that our collection of DataCollector tables into a central source". Can you share details?

  • bmurrell Community Edition User

    @SergeB said:
    Is your issue with your collection process? "Just noticed that our collection of DataCollector tables into a central source". Can you share details?

    Can't be the collection process. I'm checking the table v_internal.dc_cpu_aggregate_by_minute locally, and it doesn't contain details for node3.

  • SruthiA Vertica Employee Administrator

    Is node3 participating in queries? The data collector logs are generated in /v__node0001_catalog/DataCollector folder. Could you please check node3 to see if it is generating logs for dc tables. Is the issue specific to cpu info tables or all the tables?

  • bmurrell Community Edition User

    @SruthiA said:
    Is node3 participating in queries? The data collector logs are generated in /v__node0001_catalog/DataCollector folder. Could you please check node3 to see if it is generating logs for dc tables. Is the issue specific to cpu info tables or all the tables?

    Yes, node 3 is participating in queries. There are no logs for that dc table in that folder, although there are on the other 3 nodes.

  • SruthiA Vertica Employee Administrator

    For the CPU, I/O, and network related tables, we get the information from the OS level and populate these dc tables. By any chance, are OS metrics not populating on node3? Please try restarting node3 and see if that fixes the issue.

  • bmurrell Community Edition User

    Restarting the database (on all nodes) fixed the issue. However, node 3 didn't shut down cleanly and had to be killed. I guess that was related to this problem.

  • SruthiA Vertica Employee Administrator

    @bmurrell : Thanks for the update. It looks like node3 was hung. When you experience such behavior in future, please open a support case so that we can find the exact root cause.
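
The folder check SruthiA suggested earlier in the thread can be scripted. The following is a rough Python sketch, not something posted in the thread. It assumes that a component's Data Collector log files have names starting with the component name (for example `CpuAggregateByMinute`) and ending in `.log`; verify that against a healthy node before relying on it. The function name `component_log_files` is hypothetical, and the DataCollector folder path must be supplied by the caller.

```python
from pathlib import Path


def component_log_files(dc_dir, component="CpuAggregateByMinute"):
    """List files in dc_dir whose names start with `component` and end in .log.

    The naming pattern is an assumption for this sketch, not a documented contract.
    Returns an empty list if dc_dir does not exist or is not a directory.
    """
    directory = Path(dc_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and p.name.startswith(component) and p.suffix == ".log"
    )
```

Run it on each node against that node's DataCollector folder. An empty result on one node while the others return files matches the symptom described in this thread.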
