We have a cluster of 34 nodes, and we found that CPU utilization is high on one node. I am not able to find the exact reason why CPU utilization was high.
Please suggest, if anyone has the answer.
The diagnosis and fix may take a few attempts, so why not start with our validation scripts? Here are two links: an overview, and one for the CPU issue you're having.
In addition to Tom's suggestions, maybe check that you are not overloading that node with a disproportionate number of client sessions:
select node_name, count(*) from user_sessions group by node_name order by 2 desc;
select node_name, count(*) from user_sessions where session_start_timestamp >= 'DATE1' and session_end_timestamp <= 'DATE2' group by node_name order by 2 desc;
Replace DATE1 and DATE2 with a date range where you saw the spike.
Thanks, Guru. Let me check.
I did not find a high number of client sessions on the problematic node. I also ran the vcpuperf command on two nodes. Below is the output:
Compiled with: 4.8.2 20140120 (Red Hat 4.8.2-15)
Expected time on Core 2, 2.53GHz: ~9.5s
Expected time on Nehalem, 2.67GHz: ~9.0s
Expected time on Xeon 5670, 2.93GHz: ~8.0s
This machine's time:
CPU Time: 15.030000s
Some machines automatically throttle the CPU to save power.
This test can be done in <100 microseconds (60-70 on Xeon 5670, 2.93GHz).
Low load times much larger than 100-200us or much larger than the corresponding high load time
indicate low-load throttling, which can adversely affect small query / concurrent performance.
This machine's high load time: 130 microseconds.
This machine's low load time: 327 microseconds.
This machine's time:
CPU Time: 11.470000s
This machine's high load time: 65 microseconds.
This machine's low load time: 132 microseconds.
Please check the above output and tell me whether there is any problem. Also, which values should I look at in the vcpuperf output?
Do all of the nodes have the same processors? Check with this SQL:
select host_name, processor_count, processor_core_count, processor_description from host_resources order by 1;
Are there any other processes (besides Vertica) running on the node experiencing the issue? The following Linux command should list the top 10 processes by CPU usage:
ps -Ao user,uid,comm,pid,pcpu,tty --sort=-pcpu | head -n 11
From your vcpuperf output, it looks like CPU scaling might be enabled. You should disable it. See:
Also, check the following thread to see if it helps: https://forum.vertica.com/discussion/238751/high-cpu-usage
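To confirm whether CPU frequency scaling is active on a node, you can look at the cpufreq governor for each core. The following is a minimal sketch, assuming a Linux host that exposes cpufreq through sysfs (some virtual machines and kernels do not); the function name list_governors is only illustrative. A governor other than "performance" (for example "powersave" or "ondemand") would be consistent with the low-load throttling that the vcpuperf output describes.

```shell
#!/bin/sh
# Print "<cpu> <governor>" for every core that exposes a cpufreq governor.
# The optional first argument overrides the sysfs directory to scan;
# by default the standard Linux location is used.
list_governors() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    found=0
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip unmatched globs and unreadable files.
        [ -r "$f" ] || continue
        found=1
        # Reduce ".../cpu3/cpufreq/scaling_governor" to "cpu3".
        cpu="${f#"$cpu_root"/}"
        cpu="${cpu%%/*}"
        printf '%s %s\n' "$cpu" "$(cat "$f")"
    done
    if [ "$found" -eq 0 ]; then
        echo "no cpufreq governor files found under $cpu_root"
    fi
}

list_governors
```

Run it on the problematic node and on a healthy node and compare the results. If the cpupower utility is installed, cpupower frequency-set -g performance is one common way to switch the governor, but power-saving settings in the BIOS may also need to be changed, so check your hardware and distribution documentation first.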