What's wrong with our memory settings?
I'm new to Vertica and overwhelmed by the amount of information on memory management and configuration in this system. Our 3-node cluster is unstable, mainly because of memory utilization on the main node, the one we tend to hit with queries. Once Vertica reports that memory utilization on this node has gone above 95%, there is a high chance it will stop responding, and we have to restart the service on it.
The memory utilization profile of that node (called "node-1") is vastly different from that of the other 2 nodes (node-2 and node-3).
Here is our cluster just before node-1 went down around 7:24am:
Here is the internal memory monitor on that node (you can also see the restart cleared it up but it is starting to creep up again):
The other 2 nodes both look like this:
Our resource pool config is attached (CSV format).
The nodes run on AWS c3.4xlarge instances (16 vCPU / 30 GB RAM).
It does not look like I can give the Vertica processes much more memory, but if you think I should consider upgrading to a 64+ GB system, I can probably do that.
I would love to get some general direction from you folks -- I'm not even sure where to start.