Performance issue: high load average, low CPU
We sometimes get into a situation where one of the nodes shows a high load average (>30) while its CPU utilization drops to almost 0%; the load on the other nodes stays normal. When this happens, our app latency increases sharply and CPU utilization on all nodes drops to almost nothing.
Each node has 60 GB of memory and 16 vCPUs and runs Vertica 9.0.1-3 on CentOS 7 (kernel 4.15.6-1.el7.elrepo.x86_64).
The system is not swapping; there is ~30 GB of free memory. Disk is active but not heavily loaded (although there is much more read activity than on the other nodes). Network packets in/out and KB in/out look about normal.
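For anyone who wants to compare numbers, here is a minimal sketch of how the load and thread-state figures could be sampled on a node. It relies on the fact that the Linux load average counts threads in uninterruptible sleep (state `D`, usually waiting on I/O) as well as runnable ones, which is one way load can be high while CPU is idle. The function names are my own for illustration, not part of any Vertica tooling, and the script only reads `/proc`.

```python
import os


def parse_loadavg(text):
    """Return the (1 min, 5 min, 15 min) load averages from /proc/loadavg content."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def parse_state(stat_text):
    """Return the one-letter scheduler state from a /proc/.../stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' before reading the state.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_thread_states(proc_root="/proc"):
    """Count threads by state (R = running, S = sleeping, D = uninterruptible, ...)."""
    counts = {}
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    state = parse_state(f.read())
            except (OSError, IndexError):
                continue  # thread exited, or the line could not be parsed
            counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc is available.
    if os.path.exists("/proc/loadavg"):
        with open("/proc/loadavg") as f:
            print("load averages:", parse_loadavg(f.read()))
        print("thread states:", count_thread_states())
```

Running this on the affected node and on a healthy node at the same moment would show whether the extra load comes from threads stuck in state `D` rather than from runnable threads.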
Any ideas? What should I be looking at?