Vertica 6.1.2 in AWS Cluster getting error when creating database in cluster - getLocalNodeInternal

I am doing a POC for a client who is running Vertica 6.1.2 in AWS.

I have a 2-node cluster and was able to install the software with the cluster option with no problems.
When I create the database, it gets created on the first node, but the 2nd node keeps showing as down, and in vertica.log under the catalog directory I see the following message repeating:

nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:48.001 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:49.002 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:49.004 Cluster Inviter:0x7fefe800f310 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:49.004 Cluster Inviter:0x7fefe800f310 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:50.001 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:51.002 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:51.006 Cluster Inviter:0x7fefe800e700 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:51.006 Cluster Inviter:0x7fefe800e700 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:52.002 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:53.002 Cluster Inviter:0x7fefe800e700 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:53.002 Cluster Inviter:0x7fefe800e700 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:53.003 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:54.000 Timer Service:0x6422050 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2014-09-10 16:40:54.000 Timer Service:0x6422050 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name
2014-09-10 16:40:54.000 Timer Service:0x6422050 [Util] <INFO> Task 'LicenseSizeAuditor' enabled
2014-09-10 16:40:54.001 nameless:0x641aeb0 [Catalog] <WARNING> Catalog::getLocalNodeInternal empty node name

Has anyone encountered this, or does anyone have a clue what is going on?

The firewall is off and SELinux is disabled.
ssh between the nodes works fine.

Comments

  • Ensure that all the ports needed by Vertica are open. The ports that Vertica needs are:

    Vertica
    5433 TCP (All connections)

    Spread
    4803 TCP (Client connections)
    4803 UDP (Daemon <-> Daemon)
    4804 UDP (Daemon <-> Daemon)
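
The TCP entries in this list can be probed from another node with a small connection test. The following is a minimal Python sketch, not a Vertica tool: the names `tcp_port_open` and `report` are made up for illustration, and only the port numbers come from the list above. It does not cover the UDP ports, because UDP has no handshake that would confirm a daemon is listening.

```python
import socket

# TCP ports from the list above; the UDP ports (4803, 4804) are not covered here.
VERTICA_TCP_PORTS = {
    5433: "Vertica client connections",
    4803: "Spread client connections",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def report(host):
    """Print one line per TCP port for the given host and return the results."""
    results = {}
    for port, purpose in sorted(VERTICA_TCP_PORTS.items()):
        results[port] = tcp_port_open(host, port)
        state = "open" if results[port] else "unreachable"
        print(f"{host}:{port} ({purpose}): {state}")
    return results
```

Running `report("<private IP of the other node>")` on each node would show whether the TCP ports are reachable across the cluster. An "unreachable" result in AWS may point at the security group rather than the host firewall.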



  • Hi Eugenia,
    I checked the ports and they are available on both nodes.

    Node1
    tcp        0      0 0.0.0.0:5433                0.0.0.0:*                   LISTEN      
    tcp        0      0 172.31.27.164:5433          172.31.27.164:57390         ESTABLISHED 
    tcp        0      0 172.31.27.164:57390         172.31.27.164:5433          ESTABLISHED 
    tcp        0      0 :::5433                     :::*                        LISTEN      
    udp        0      0 0.0.0.0:5433                0.0.0.0:*                               
    testclust [root@vertn1 ~]# netstat -an | grep 480
    tcp        0      0 0.0.0.0:54804               0.0.0.0:*                   LISTEN      
    tcp        0      0 127.0.0.1:4803              0.0.0.0:*                   LISTEN      
    tcp        0      0 172.31.27.164:4803          0.0.0.0:*                   LISTEN      
    udp        0      0 127.0.0.1:4803              0.0.0.0:*                               
    udp        0      0 172.31.27.164:4803          0.0.0.0:*                               
    udp        0      0 172.31.31.255:4803          0.0.0.0:*                               
    udp        0      0 127.0.0.1:4804              0.0.0.0:*                               
    udp        0      0 172.31.27.164:4804          0.0.0.0:*                               
    unix  2      [ ACC ]     STREAM     LISTENING     32701  /tmp/4803
    unix  3      [ ]         STREAM     CONNECTED     35357  /tmp/4803
    testclust [root@vertn1 ~]# 

    Node 2
    tcp        0      0 0.0.0.0:5433                0.0.0.0:*                   LISTEN      
    tcp        0      0 172.31.27.164:5433          172.31.27.164:57390         ESTABLISHED 
    tcp        0      0 172.31.27.164:57390         172.31.27.164:5433          ESTABLISHED 
    tcp        0      0 :::5433                     :::*                        LISTEN      
    udp        0      0 0.0.0.0:5433                0.0.0.0:*                               
    testclust [root@vertn1 ~]# netstat -an | grep 480
    tcp        0      0 0.0.0.0:54804               0.0.0.0:*                   LISTEN      
    tcp        0      0 127.0.0.1:4803              0.0.0.0:*                   LISTEN      
    tcp        0      0 172.31.27.164:4803          0.0.0.0:*                   LISTEN      
    udp        0      0 127.0.0.1:4803              0.0.0.0:*                               
    udp        0      0 172.31.27.164:4803          0.0.0.0:*                               
    udp        0      0 172.31.31.255:4803          0.0.0.0:*                               
    udp        0      0 127.0.0.1:4804              0.0.0.0:*                               
    udp        0      0 172.31.27.164:4804          0.0.0.0:*                               
    unix  2      [ ACC ]     STREAM     LISTENING     32701  /tmp/4803
    unix  3      [ ]         STREAM     CONNECTED     35357  /tmp/4803

    Could anything else be the issue?

    I have also tried installing with node names and then with IP addresses, with the same result.
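
One detail in the output above: `grep 480` also matches unrelated ports such as 54804. A stricter check compares exact port numbers in the local-address column. The sketch below is a hypothetical helper (the name `listening_ports` and the `sample` text are made up for illustration) that reads `netstat -an` text in the column layout shown above. It only shows what is bound locally, not whether the other node can reach it.

```python
def listening_ports(netstat_text):
    """Return a set of (protocol, port) pairs bound locally, from `netstat -an` text.

    TCP sockets are counted only in the LISTEN state; UDP sockets have no
    state column, so every bound UDP socket is counted.
    """
    found = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip prompts, unix sockets, and anything that is not a tcp/udp row.
        if len(fields) < 5 or fields[0] not in ("tcp", "udp"):
            continue
        proto, local = fields[0], fields[3]
        if proto == "tcp" and "LISTEN" not in fields[5:]:
            continue
        # The port is whatever follows the last colon (works for ":::5433" too).
        port = local.rsplit(":", 1)[-1]
        if port.isdigit():
            found.add((proto, int(port)))
    return found


# Ports named earlier in the thread.
REQUIRED = {("tcp", 5433), ("tcp", 4803), ("udp", 4803), ("udp", 4804)}

# Shortened sample in the same layout as the output above.
sample = """\
tcp        0      0 0.0.0.0:54804               0.0.0.0:*                   LISTEN
tcp        0      0 172.31.27.164:4803          0.0.0.0:*                   LISTEN
udp        0      0 172.31.27.164:4803          0.0.0.0:*
udp        0      0 172.31.27.164:4804          0.0.0.0:*
"""
missing = REQUIRED - listening_ports(sample)
print(sorted(missing))  # the sample has no 5433 listener, so that pair is reported
```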
  • This error is normally a spread issue. These are the steps to debug spread. You have already done step 3, so please check the other steps. Step 4 will tell you whether spread is working correctly:
    1- If you are on Amazon EC2, make sure that you used the -N -T options when you installed Vertica.

    2- Check the /opt/vertica/config/vspread.conf files on all nodes – they need to be identical.

    If Vertica was installed with the -N -T options, each node should have its own segment, like this:

    Spread_Segment 10.104.33.179:4803 {
      N010104033179    10.104.33.179 {
        10.104.33.179
        127.0.0.1
      }
    }
    Spread_Segment 10.104.87.247:4803 {
      N010104087247    10.104.87.247 {
        10.104.87.247
        127.0.0.1
      }
    }

    A vspread.conf file with just one segment containing all the nodes, like this, is incorrect:

    Spread_Segment 10.104.33.179:4803 {
      N010010001023    10.10.1.23 {
        10.104.33.179
        127.0.0.1
      }
      N010010001024    10.104.87.247 {
        10.104.87.247
        127.0.0.1
      }
    }

    3- If the vspread configuration is fine, verify that the ports on the nodes are open. Spread needs ports 4803 and 4804.

    You can verify this with a netstat test, for example:

    # sudo netstat -uatp | grep 480
    tcp 0 0 localhost.localdomain:4803 *:* LISTEN 2374/spread
    tcp 0 0 priv-mymachine1.vertica:4803 *:* LISTEN 2374/spread
    tcp 0 0 localhost.localdomain:smtp *:* LISTEN 4800/sendmail: acce
    udp 0 0 localhost.localdomain:4803 *:* 2374/spread
    udp 0 0 priv-mymachine.vertica:4803 *:* 2374/spread
    udp 0 0 192.168.xxx.xxx:4803 *:* 2374/spread
    udp 0 0 localhost.localdomain:4804 *:* 2374/spread
    udp 0 0 priv-mymachine1.verti:4804 *:* 2374/spread

     

    4- Run a spread test with spuser to see whether the nodes are at least communicating with each other.

    You can run spuser -r like this:
    /opt/vertica/spread/bin/spuser -r
    At the prompt User >
    enter “j test”
    j is for join, and you are creating a group called test.

    Repeat this on all nodes to confirm that they are connecting to each other. You should see the new connections come in with each node connecting.
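
The segment check from step 2 can be automated. The following is a minimal Python sketch under stated assumptions: it only understands the layout shown in the two vspread.conf examples above (a `Spread_Segment ... {` line followed by node blocks), a real file may contain other directives it ignores, and the function names are made up for illustration.

```python
def nodes_per_segment(conf_text):
    """Map each Spread_Segment address to the node names declared inside it."""
    segments = {}
    current = None
    depth = 0  # 0 = outside a segment, 1 = inside a segment, 2 = inside a node block
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("Spread_Segment") and line.endswith("{"):
            current = line.split()[1]
            segments[current] = []
            depth = 1
        elif line.endswith("{") and depth == 1 and current is not None:
            # A node entry: the first token is the node name.
            segments[current].append(line.split()[0])
            depth = 2
        elif line == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                current = None
    return segments


def one_node_per_segment(conf_text):
    """True if every segment declares exactly one node, as in the -N -T example."""
    segments = nodes_per_segment(conf_text)
    return bool(segments) and all(len(names) == 1 for names in segments.values())
```

To use it, read /opt/vertica/config/vspread.conf on each node and pass the contents to `one_node_per_segment`. Comparing the raw file contents across nodes covers the requirement that the files be identical.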

  • Hi Eugenia,
        You were right, it was a spread issue. I uninstalled Vertica and reinstalled it with the -N and -T options, and then tested spread. I did not know that -N and -T have to be used in AWS.

        Are -N and -T also required for the later versions, 7.x and 7.1.x?

        Thank you again, I really appreciate your help.
  • Yes. -T makes spread use point-to-point messaging, and -N is for nodes on different subnets. That has not changed in newer versions.
    Glad that you resolved your issue.
    Eugenia
