What's wrong with our memory settings?
Hi,
I'm new to Vertica and overwhelmed by the amount of information on memory management and configuration in this system. Our 3-node cluster is unstable, mainly because of memory utilization on the main node we tend to hit with queries. Once Vertica reports that memory utilization on this node has gone above 95%, there is a high chance it will stop responding, and we then have to restart the service on it.
The memory utilization profile of that node (called "node-1") is vastly different from that of the other 2 nodes (node-2 and node-3).
Here is our cluster just before node-1 went down around 7:24am:
Here is the internal memory monitor on that node (you can also see the restart cleared it up but it is starting to creep up again):
The other 2 nodes both look like this:
Our resource pool config is attached (CSV format).
The hardware systems that are running on AWS are c3.4xlarge (16 vCPU / 30 GB RAM).
It does not look like I can give the Vertica processes much more memory, but if you think I should consider upgrading to a 64+ GB system, I can probably do that.
I would love to get some general direction from you folks -- I'm not even sure where to start.
BIG Thanks!
Comments
Hi,
You said that "The hardware systems that are running on AWS are c3.4xlarge (16 vCPU / 30 GB RAM)."
For maximum performance, Vertica nodes should include at least 256 GB of RAM. A good rule of thumb is to have 8–12 GB of RAM per physical core in the server.
See:
https://my.vertica.com/kb/GenericHWGuide/Content/Hardware/GenericHWGuide.htm
According to the info here (https://aws.amazon.com/ec2/virtualcores/), an EC2 c3.4xlarge instance type has a "Virtual Core Count" of 8. At 8–12 GB of RAM per core, you would want to have at least 64–96 GB of RAM.
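For anyone who wants to redo that arithmetic for a different instance type, here is a rough sketch in Python. The function name is made up for illustration, and it assumes 2 vCPUs per physical core, which matches the 16 vCPU / 8 core figures above but may not hold for every instance family.

```python
def recommended_ram_gb(vcpus, vcpus_per_core=2, gb_per_core=(8, 12)):
    """Return the (low, high) RAM range in GB from the 8-12 GB per core rule.

    vcpus_per_core=2 is an assumption (hyperthreaded cores); adjust it
    if the instance type exposes one vCPU per physical core.
    """
    cores = vcpus // vcpus_per_core
    low_per_core, high_per_core = gb_per_core
    return cores * low_per_core, cores * high_per_core


# c3.4xlarge: 16 vCPUs, so 8 physical cores under the assumption above
low, high = recommended_ram_gb(16)
print(f"{low}-{high} GB")  # 64-96 GB
```

By that rule, the current 30 GB per node is less than half of the low end of the range.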
Also, look into "Connection Load Balancing". It can help you avoid overloading your node-1.
See:
https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/AdministratorsGuide/LoadBalancing/ConnectionLoadBalancing.htm
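As a rough illustration only: connection load balancing has to be enabled on the server (for example with SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');) and also requested by the client. If you happen to connect from Python with the vertica-python driver, the client side might look like the sketch below. The helper name, host, user, and database values are placeholders, and I am assuming the driver's connection_load_balance option; check the docs for whichever client you actually use.

```python
def build_conn_info(host, user, password, database, port=5433):
    """Build connection options that ask the server to load-balance.

    The dict is meant to be passed to vertica_python.connect(**conn_info).
    'connection_load_balance' only has an effect if a load balance policy
    is also set on the server side.
    """
    return {
        "host": host,          # any node in the cluster; placeholder value below
        "port": port,          # 5433 is the default Vertica client port
        "user": user,
        "password": password,
        "database": database,
        "connection_load_balance": True,
    }


# Hypothetical values, for illustration only
conn_info = build_conn_info("node-1", "dbadmin", "secret", "mydb")
print(conn_info["connection_load_balance"])  # True
```

With this on, the node you first contact can redirect the session to another node, so node-1 no longer has to act as the initiator for every query.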
Thank you Jim!