High CPU on new node
Recently we added a 4th node to our 3-node cluster (Vertica 7.2.0-1 on AWS r3.2xlarge instances).
Adding the node was fairly painless. The only issue is that the new node seems to use about twice as much CPU as the other three nodes:
select node_name, trunc(AVG(average_cpu_usage_percent), 1) from cpu_usage WHERE start_time BETWEEN NOW() - INTERVAL '15 minutes' AND NOW() group by node_name;
At first we suspected the rebalancing process was still running, but the following query returns zero results:
select * from system_sessions where session_type = 'REBALANCE_CLUSTER' and is_active = true;
However, the logs on node 4 do show a 'rebalance_cluster(background)' transaction that starts and immediately rolls back every 5 minutes:
2017-02-07 09:42:14.010 RebalanceCluster:0x7f106c014ee0-d000000010d377 [Txn] <INFO> Begin Txn: d000000010d377 'rebalance_cluster(background)'
2017-02-07 09:42:14.010 RebalanceCluster:0x7f106c014ee0-d000000010d377 [Txn] <INFO> Rollback Txn: d000000010d377 'rebalance_cluster(background)'
2017-02-07 09:42:14.017 RebalanceCluster:0x7f106c014ee0 [Util] <INFO> Task 'RebalanceCluster' enabled
These entries do not appear in the logs on the other nodes.
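To confirm how often the background transaction fires and how quickly it rolls back, the Begin/Rollback entries can be paired up with a small script. The following is only a rough sketch in Python: it assumes the log lines have exactly the layout shown in the excerpt above, and the names `TXN_LINE`, `rebalance_attempts` and `gaps_between_attempts` are made up for illustration, not part of any Vertica tooling.

```python
import re
from datetime import datetime

# Assumed layout (taken from the excerpt above):
# <timestamp> <thread-info> [Txn] <INFO> Begin|Rollback Txn: <id> 'rebalance_cluster(background)'
TXN_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ \[Txn\] <INFO> "
    r"(?P<event>Begin|Rollback) Txn: (?P<txn>\S+) 'rebalance_cluster\(background\)'"
)


def rebalance_attempts(lines):
    """Return (txn_id, begin_time, seconds_until_rollback) for each paired attempt."""
    begins = {}
    attempts = []
    for line in lines:
        m = TXN_LINE.match(line)
        if not m:
            continue  # unrelated log line
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        txn = m.group("txn")
        if m.group("event") == "Begin":
            begins[txn] = ts
        elif txn in begins:
            start = begins.pop(txn)
            attempts.append((txn, start, (ts - start).total_seconds()))
    return attempts


def gaps_between_attempts(attempts):
    """Seconds between the start of each attempt and the start of the next one."""
    starts = [begin for _, begin, _ in attempts]
    return [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    # The three lines from the excerpt above; in practice, pass an open log file instead.
    sample = [
        "2017-02-07 09:42:14.010 RebalanceCluster:0x7f106c014ee0-d000000010d377 [Txn] <INFO> "
        "Begin Txn: d000000010d377 'rebalance_cluster(background)'",
        "2017-02-07 09:42:14.010 RebalanceCluster:0x7f106c014ee0-d000000010d377 [Txn] <INFO> "
        "Rollback Txn: d000000010d377 'rebalance_cluster(background)'",
        "2017-02-07 09:42:14.017 RebalanceCluster:0x7f106c014ee0 [Util] <INFO> "
        "Task 'RebalanceCluster' enabled",
    ]
    for txn, begin, seconds in rebalance_attempts(sample):
        print(txn, begin, seconds)
```

Running the same functions over the full log of each node would show whether the gaps are consistently around 300 seconds on node 4 and confirm that the other nodes have no such attempts.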
One of the few pages in the documentation that address high CPU usage is https://my.vertica.com/docs/7.2.x/HTML/Content/Authoring/AdministratorsGuide/Monitoring/Vertica/MonitoringLinuxResourceUsage.htm which suggests setting the swappiness parameter to 0. Changing this parameter had no effect.
I hope someone can point me in the right direction for finding the cause of the high CPU usage on the new node.