Vertica is not starting up due to ulimit being insufficient

karthik
edited November 3 in General Discussion

$ admintools -t start_db -d my_database -U
Info: no password specified, using none
Could not login with SSH to host 10.3.2.110

    Starting nodes: 
            v_my_database_node0001 (10.3.2.110) ***UNAVAILABLE***
            v_my_database_node0002 (10.3.2.111)
            v_my_database_node0003 (10.3.2.112)
    Vertica names unique storage locations using a random algorithm. The operating system improves the randomness of its pseudo-random-number generator by gathering 'entropy' from the environment.  Either some nodes currently have insufficient entropy to provide sufficient randomness (in which case, the problem is solved by waiting for sufficient entropy to accumulate, a process which should not take much more than a minute) or some error has prevented reading the entropy value from the node(s) (if this process doesn't make progress in a few minutes --- perhaps longer on larger clusters --- contact Vertica technical support).
    Starting Vertica on all nodes. Please wait, databases with large catalog may take a while to initialize.
    10.3.2.111 failed. Result:

status=1 host=10.3.2.111 content={u'command_list': [u'/opt/vertica/bin/vertica', u'-D', u'/data/my_database/v_my_database_node0002_catalog', u'-C', u'my_database', u'-n', u'v_my_database_node0002', u'-h', u'10.3.2.111', u'-p', u'5433', u'-P', u'4803', u'-Y', u'ipv4', u'-U'], u'db_log_path': u'/data/my_database/dbLog', u'start_stdout': None, u'special_environment': None, u'return_code': 1, u'start_stderr': None}
Check /data/my_database/dbLog for more information.
Press RETURN to continue
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Node Status: v_my_database_node0001: (DOWN) v_my_database_node0002: (DOWN) v_my_database_node0003: (DOWN)
Error starting database, no nodes are up
Press RETURN to continue
Database my_database did not start successfully

We saw the following in startup.log:

v_my_database_node0003_catalog]$ tail -f startup.log
{
"node" : "v_my_database_node0003",
"stage" : "Database Halted",
"text" : "Not enough open file handles allowed (1024 available/32768 required); see 'ulimit -n'.\n",
"timestamp" : "2020-10-28 10:43:03.353"
}
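
For anyone hitting the same message, here is a minimal sketch that compares the open-file limits of the current shell with the 32768 the log line asks for. The helper name nofile_ok is made up for illustration; ulimit -Sn and ulimit -Hn are the standard shell builtins for the soft and hard limits.

```shell
# Minimum taken from the startup.log message above.
REQUIRED_NOFILE=32768

# Succeeds when a limit value meets the minimum.
# $1 = limit (a number or "unlimited"), $2 = optional required minimum.
nofile_ok() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "${2:-$REQUIRED_NOFILE}" ]
}

soft=$(ulimit -Sn)   # soft limit: the value a process actually runs with
hard=$(ulimit -Hn)   # hard limit: ceiling the soft limit may be raised to

if nofile_ok "$soft"; then
    echo "soft nofile $soft: ok"
else
    echo "soft nofile $soft: below $REQUIRED_NOFILE"
fi

if nofile_ok "$hard"; then
    echo "hard nofile $hard: ok"
else
    echo "hard nofile $hard: below $REQUIRED_NOFILE"
fi
```

Running this as dbadmin in the same session that is used to start the database shows whether that session has the limit Vertica expects.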

We set the following in /etc/security/limits.conf (followed by a reboot):

* soft nofile 191918
* hard nofile 191918

We still saw the same error. With a fresh login via su -l -s /bin/bash, ulimit -n showed 191918 for the root user.

Can somebody point out where the problem is? A reboot didn't help, but a new session via su -l -s /bin/bash always does. We'd like to know how to fix this permanently: each time we shut Vertica down, we can't bring it back up without first getting a new session with su -l -s /bin/bash.
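
One way to narrow this down is to look at the limits a given process actually has, rather than at what limits.conf says. limits.conf is applied by pam_limits when a session is opened, so a process keeps whatever limits it started with. On Linux, /proc/<pid>/limits lists them. The sketch below is only an illustration; the function names soft_nofile_from_limits and show_nofile are made up here.

```shell
# Read /proc/<pid>/limits-style text on stdin and print the soft
# "Max open files" value (4th whitespace-separated field of that row).
soft_nofile_from_limits() {
    awk '/^Max open files/ { print $4 }'
}

# Print the soft open-files limit of a process; defaults to this shell.
show_nofile() {
    pid="${1:-$$}"
    if [ -r "/proc/$pid/limits" ]; then
        soft_nofile_from_limits < "/proc/$pid/limits"
    else
        echo "cannot read /proc/$pid/limits" >&2
        return 1
    fi
}

show_nofile || true   # limit of the current shell
```

Running show_nofile in the session used for admintools, and again after su -l -s /bin/bash, would show whether the two sessions really differ. show_nofile <pid> does the same for an already running process.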

Answers

  • Hi Jim,
    We set the hard and soft limits explicitly to the amount of RAM available in the system (191918 MB) and rebooted the boxes. The problem is that the new ulimit is not reflected unless we get a new session using su -l -s /bin/bash. We need to figure out why, and this is where we need some ideas.
    Besides, we've noticed that the ulimit has always been 191918 for the dbadmin user but 1024 for the root user. Does the root user also need a higher ulimit?
    Thanks,
    Karthik

  • Jim_Knicely Administrator

    The setting is for the dbadmin user as it's the one that owns the Vertica processes.
    Just noticed the error at the beginning of your original post:

    Info: no password specified, using none
    Could not login with SSH to host 10.3.2.110

    What node (IP) were you on when you ran the start_db command? And which user were you?

  • It was the dbadmin user, and I was on node1. Note that after running su -l -s /bin/bash and switching to the dbadmin user, this works.
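
Judging from the "Could not login with SSH" message, admintools reaches the other nodes over SSH, so it may be worth checking which limit a non-interactive SSH command gets on each node as dbadmin. A rough sketch under that assumption: run_remote and report_nofile are illustrative names, and the host list is copied from the admintools output in the original post.

```shell
# Hosts taken from the admintools output in this thread.
HOSTS="10.3.2.110 10.3.2.111 10.3.2.112"

# Run a command on a host over ssh as the current user.
# BatchMode avoids hanging on a password prompt; ConnectTimeout bounds the wait.
run_remote() {
    host="$1"; shift
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" "$@"
}

# Print the open-files limit that a non-interactive ssh command sees on each host.
report_nofile() {
    for host in $HOSTS; do
        limit=$(run_remote "$host" 'ulimit -n' 2>/dev/null) || limit="ssh failed"
        echo "$host: $limit"
    done
}

# Usage (as dbadmin, on the node where admintools is run):
#   report_nofile
```

If a host reports a value below 32768 here while an interactive login on the same host reports 191918, the two kinds of session are getting different limits, which would match the behaviour described above.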
