Vertica 8.1.1 OOM (constantly getting killed)
We have been plagued by OOM issues. Today two nodes failed within 15 minutes of each other, and every month at least one node fails due to OOM. Here are the details:
DB Version: 8.1.1-6 (3 nodes)
AWS AMI: Vertica 8.1.1 CentOS 7.3 - 1498566984-38d06046-9fbd-4e9e-8f59-cfdb7b6de752-ami-751f2e63.4 (ami-85ffe3fc)
OS: Centos 7.3 3.10.0-514.6.2.el7.x86_64
glibc: glibc-2.17-196.el7.x86_64
RAM:
              total        used        free      shared  buff/cache   available
Mem:            62G        4.4G         40G        491M         17G         57G
Swap:           15G        727M         15G
Resource pools:
With MEMORYSIZE set:
  sysquery: 64M, sysdata: 100M, wosdata: 2G, tm: 2G, p_dashboard (custom pool): 8G (cascades to general)
With MAXMEMORYSIZE set:
  general: 48G, sysdata: 1G, wosdata: 2G, jvm: 2G, monitoring: 2G, blobdata: 10% (not used; we don't run any machine learning).
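For completeness, these settings can be read back from the catalog; a minimal sketch, assuming the standard v_catalog.resource_pools system table (column names may differ slightly between versions):

-- List the memory settings of every resource pool.
-- Column names assumed from v_catalog.resource_pools; adjust for your version.
SELECT name, memorysize, maxmemorysize, plannedconcurrency
FROM resource_pools
ORDER BY name;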
OOM dmesg logs:
[  pid  ]   uid   tgid  total_vm       rss  nr_ptes  swapents  oom_score_adj  name
[331310]   1001 331310  25760521  15801473    42609   3996082              0  vertica
This shows vertica with 60.2 GB RSS (15801473 pages x 4 KiB) and roughly 15 GB in swap entries; no other process has even close to 1 GB RSS.
sysctl (changes from the base AMI): vm.swappiness=1
Any help would be greatly appreciated.
Comments
Hi,
Did you see the posts recommending an upgrade to Vertica 8.1.1-9 if you are on 8.1.1-6?
https://forum.vertica.com/discussion/239197/vertica-8-1-1-6
Thanks, we switched to Vertica 8.1.1-10 but promptly got OOM errors again a few days later. We have since lowered the general pool MAXMEMORYSIZE (see the sketch below). However, other issues with 8.1.1-10 forced us to switch versions again, this time to 9.0.0-2: the 8.1.1-10 system would constantly segfault on startup while performing an analyze row task, and we couldn't keep the database up.
See: https://forum.vertica.com/discussion/239313/8-1-1-10-segfault-on-startup-after-crash/
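For reference, lowering the general pool cap is a one-line change; a minimal sketch, where the '40G' value is only an illustrative placeholder, not the value we actually used:

-- Hypothetical example: cap the GENERAL pool lower so the OS and catalog have headroom.
ALTER RESOURCE POOL general MAXMEMORYSIZE '40G';

-- Verify the new cap.
SELECT name, maxmemorysize FROM resource_pools WHERE name = 'general';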
What's the size of the metadata pool? (Catalog size)
Anything particularly interesting in your workload? UDx?
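For anyone checking the same thing, the metadata pool's per-node usage can be read with something like the following, assuming the standard v_monitor.resource_pool_status table:

-- Show how much memory the metadata pool (catalog) is using on each node.
SELECT node_name, pool_name, memory_inuse_kb
FROM resource_pool_status
WHERE pool_name = 'metadata'
ORDER BY node_name;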