What's wrong with our memory settings?

edited July 2018 in General Discussion

Hi,

I'm new to Vertica and overwhelmed by the amount of information on memory management and configuration in this system. Our 3-node cluster is unstable, mainly because of memory utilization on the main node, which is the one we tend to hit with queries. Once Vertica reports that memory utilization on this node has gone above 95%, there is a high chance it will stop responding, and we have to restart the service on it.

The memory utilization profile of that node (called "node-1") is vastly different from that of the other two nodes (node-2 and node-3).

Here is our cluster just before node-1 went down around 7:24am:

Here is the internal memory monitor on that node (you can also see the restart cleared it up but it is starting to creep up again):

The other two nodes both look like this:

Our resource pool config is attached (CSV format).

The nodes run on AWS c3.4xlarge instances (16 vCPU / 30 GB RAM).
It does not look like I can give the Vertica processes much more memory, but if you think I should consider upgrading to a 64+ GB system, I can probably do that.
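
To put the 95% figure in context, here is a rough sketch of the headroom arithmetic for the current and the proposed instance size. It assumes the reported utilization is measured against total physical RAM, which may not match how Vertica actually computes it; `headroom_gb` is just an illustrative helper name.

```python
def headroom_gb(total_ram_gb: float, utilization: float = 0.95) -> float:
    """Return the RAM (in GB) still free once utilization reaches the given fraction.

    Assumption: utilization is a fraction of total physical RAM on the node.
    """
    return total_ram_gb * (1.0 - utilization)


# Compare the current 30 GB node with a hypothetical 64 GB replacement.
for ram_gb in (30, 64):
    free = headroom_gb(ram_gb)
    print(f"{ram_gb} GB node: about {free:.1f} GB free at 95% utilization")
```

Under that assumption, the current nodes have only about 1.5 GB left when the 95% mark is crossed, and a 64 GB node would have roughly 3.2 GB.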

I would love to get some general direction from you folks -- I'm not even sure where to start.

BIG Thanks!
