glibc memory bloat

It appears that the memory of my Vertica process increases over time and the process eventually crashes with an OOM (out-of-memory) kill. Can you help me?

Answers

  • skeswani Employee
    edited October 2019

    glibc has a performance optimization that results in glibc holding on to memory that the application is no longer using and does not need.
    Over time, this additional memory may lead the kernel to kill the application (i.e. Vertica) with an OOM kill.
    This is more pronounced with multi-threaded applications and can result in frequent OOM (out-of-memory) crashes.
    See:
    https://stackoverflow.com/questions/48651432/glibc-application-holding-onto-unused-memory-until-just-before-exit
    https://sourceware.org/bugzilla/show_bug.cgi?id=11261
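    To get a rough sense of why this is worse on multi-threaded, many-core hosts: per the glibc documentation, when MALLOC_ARENA_MAX is unset the limit on malloc arenas is 8 times the number of online cores on 64-bit systems (2 times on 32-bit systems). The sketch below only prints that default cap for the current host; the helper name default_arena_cap is made up for this example.

```shell
# Default glibc arena cap when MALLOC_ARENA_MAX is unset:
# 8 x online cores on 64-bit systems, 2 x online cores on 32-bit systems.
# default_arena_cap is a hypothetical helper, not part of glibc or Vertica.
default_arena_cap() {
    local cores=$1 bits=${2:-64}
    if [ "$bits" -eq 64 ]; then
        echo $(( cores * 8 ))
    else
        echo $(( cores * 2 ))
    fi
}

cores=$(getconf _NPROCESSORS_ONLN)   # number of online cores
bits=$(getconf LONG_BIT)             # 64 or 32
echo "online cores: $cores, default arena cap: $(default_arena_cap "$cores" "$bits")"
```

    Each arena can retain freed memory of its own, so a large cap leaves more room for the process RSS to grow.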

    This is not considered a glibc bug, since there are tunables that allow the user to control this behavior.
    There are a few things a Vertica admin can do to prevent OOMs:

    Use the tunable that glibc provides to control memory growth

    In ~/.bashrc on ALL nodes, add the following line:

    export MALLOC_ARENA_MAX=4

    Log out and log back in, then restart the Vertica database.
    OR, if you are unable to edit ~/.bashrc, always start the database using admintools with the VERTICA_ADMINTOOLS_PASSTHROUGH variable set as follows:

    $ VERTICA_ADMINTOOLS_PASSTHROUGH=MALLOC_ARENA_MAX=4 admintools -t start_db -d mydb

    You can check that the environment variable is set correctly in your shell using the following command:

    $ set | grep ARENA
    MALLOC_ARENA_MAX=4

    To ensure that the environment has been applied to the running process correctly (check all nodes):

    $ xargs --null --max-args=1 echo < /proc/$(pgrep vertica$)/environ | grep ARENA
    MALLOC_ARENA_MAX=4 <== Verified
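    If more than one process matches on a node, the one-liner above fails because the /proc path gets several PIDs. A minimal sketch of the same check that loops over every match, assuming you run it as the user that owns the Vertica process (or as root) so /proc/<pid>/environ is readable; environ_value is a helper name invented for this example:

```shell
# Print the value of a variable from a NUL-separated environ file
# (the format of /proc/<pid>/environ). Prints nothing if it is not set.
# environ_value is a hypothetical helper name.
environ_value() {
    local file=$1 name=$2
    tr '\0' '\n' < "$file" | sed -n "s/^${name}=//p"
}

# Check every process on this node whose name ends in "vertica".
for pid in $(pgrep 'vertica$'); do
    value=$(environ_value "/proc/$pid/environ" MALLOC_ARENA_MAX)
    echo "pid $pid: MALLOC_ARENA_MAX=${value:-<not set>}"
done
```

    A "<not set>" result means the process was started before the variable was exported and needs a restart.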

    Run a UDx (User Defined Extension) that trims the memory periodically

    With g++ version 4.8 or earlier, compile the following User Defined Extension:

    /usr/bin/g++ -D HAVE_LONG_INT_64 -I /opt/vertica/sdk/include -Wall -shared -Wno-unused-value -fPIC -o /tmp/mtUDX.so /tmp/udx_malloc_trim.cpp /opt/vertica/sdk/include/Vertica.cpp

    (Attached: udx_malloc_trim.txt, since .cpp files are sometimes blocked. Rename this file to udx_malloc_trim.cpp before you compile it.)

    After you have compiled the extension, install it as shown below

    $ vsql
    DROP LIBRARY if exists MallocTrim CASCADE;
    NOTICE 4185: Nothing was dropped
    DROP LIBRARY
    create library MallocTrim as '/tmp/mtUDX.so';
    CREATE LIBRARY
    CREATE transform FUNCTION mallocTrim AS LANGUAGE 'C++' NAME 'MallocTrimFactory' LIBRARY MallocTrim NOT FENCED;
    CREATE TRANSFORM FUNCTION

    Connect vsql to THE NODE where the Vertica RSS memory is large and run MallocTrim(1). This trims the Vertica memory usage and prevents an OOM.

    $ vsql -h $vertica_host_ip
    skeswani=> select MallocTrim(1) over ();
          host       | what |  when  |   value
    -----------------+------+--------+-----------
     skeswani-laptop | RSS  | before | 237502464
     skeswani-laptop | RSS  | after  | 228950016
    (2 rows)

    Here is a simple shell script that checks memory use and runs MallocTrim when the memory usage exceeds 11 GB.
    Although I use 11 GB in this example, you should use 85% of the memory on the machine as the trigger to run the trim
    (i.e. if your host has 100 GB of RAM, you should call MallocTrim when the Vertica RSS size exceeds 85 GB).

    $ pages=$(cut -f2 -d ' ' /proc/$(pgrep vertica$)/statm)
    $ let mem="pages * 4096"
    $ if [ "$mem" -gt "11811160064" ]; then vsql -h $vertica_host_ip -a -c "select MallocTrim(1) over ();"; fi
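    The 85% rule can also be computed on the host rather than hard-coded. The following is a sketch under a few assumptions: MemTotal from /proc/meminfo is taken as the machine's memory, the page size comes from getconf instead of a fixed 4096, only the first matching Vertica process is checked, and threshold_bytes and rss_bytes are helper names invented for this example.

```shell
# Bytes corresponding to <percent> of <mem_total_kb> (MemTotal is reported in kB).
threshold_bytes() {
    local mem_total_kb=$1 percent=$2
    echo $(( mem_total_kb * 1024 * percent / 100 ))
}

# Resident set size of a pid in bytes, from field 2 of /proc/<pid>/statm.
rss_bytes() {
    local pid=$1 pages
    pages=$(cut -f2 -d ' ' "/proc/$pid/statm")
    echo $(( pages * $(getconf PAGESIZE) ))
}

mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
limit=$(threshold_bytes "$mem_total_kb" 85)
pid=$(pgrep 'vertica$' | head -n 1)

# Trim only if a Vertica process exists and its RSS is above the limit.
if [ -n "$pid" ] && [ "$(rss_bytes "$pid")" -gt "$limit" ]; then
    vsql -h "$vertica_host_ip" -a -c "select MallocTrim(1) over ();"
fi
```

    To trim periodically, a script like this could be run on each node from a scheduler such as cron, with vertica_host_ip set to that node's address.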

    Note: This only applies to Vertica version 9.1 and earlier. Starting with Vertica version 9.2, this is done automatically by the database and no admin intervention is needed.
