We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


spread client Saw membership message 8192 on V: <database_name> — Vertica Forum

spread client Saw membership message 8192 on V: &lt;database_name&gt;

Hi,

In our production vertica cluster (3 nodes), one node has been crashing (not the same node) for 3 times this week

I didn't find no error in /var/log*.log files.

Our cluster is installed on AWS. It's version is : v7.1.1.

 

 

-----------------
2016-10-22 13:22:11.003 LGELaggingCheck:0x7f508c013b30-c0000000e27980 [Txn] <INFO> Begin Txn: c0000000e27980 'Recovery: Get last good epoch'
2016-10-22 13:22:11.007 LGELaggingCheck:0x7f508c013b30-c0000000e27980 [Txn] <INFO> Starting Commit: Txn: c0000000e27980 'Recovery: Get last good epoch'
2016-10-22 13:22:11.007 LGELaggingCheck:0x7f508c013b30 [Txn] <INFO> Commit Complete: Txn: c0000000e27980 at epoch 0x2590c6
2016-10-22 13:22:11.008 LGELaggingCheck:0x7f508c013b30-c0000000e27981 [Txn] <INFO> Begin Txn: c0000000e27981 'Check LGE'
2016-10-22 13:22:11.008 LGELaggingCheck:0x7f508c013b30-c0000000e27981 [Txn] <INFO> Rollback Txn: c0000000e27981 'Check LGE'
2016-10-22 13:22:11.008 LGELaggingCheck:0x7f508c013b30 [Util] <INFO> Task 'LGELaggingCheck' enabled
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw membership message 8192 on V:xxxxx
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw membership message 8192 on Vertica:all
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw membership message 8192 on Vertica:join
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> Saw membership message 6144 on V:officedw
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> NETWORK change with 1 VS sets
2016-10-22 13:22:27.450 Spread Client:0x8d5a1a0 [Comms] <INFO> DB Group changed

 

------------------------------------

 

 

2016-10-21 03:12:47.079 Init Session:0x7f35a4015860 <FATAL> @v_xxxx_node0003:
{SessionRun} 57V03/5785: Cluster Status Request by 10.165.30.20:54401
HINT: Cluster State: xxxx
UP: 3 of 3 (v_xxxx_node0001, v_xxxx_node0002, v_xxxx_node0003)
----
LOCATION: initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
2016-10-21 03:13:02.585 Init Session:0x7f35a4012c40 <LOG> @v_xxxx_node0003: 00000/2705: Connection received: host=10.165.30.20 port=54405 (connCnt 3)
2016-10-21 03:13:02.585 Init Session:0x7f35a4012c40 <LOG> @v_xxxx_node0003: 00000/4540: Received SSL negotiation startup packet
2016-10-21 03:13:02.585 Init Session:0x7f35a4012c40 <LOG> @v_xxxx_node0003: 00000/4691: Sending SSL negotiation response 'N'
2016-10-21 03:13:02.586 Init Session:0x7f35a4012c40 <FATAL> @v_xxxx_node0003: {SessionRun}
57V03/5785: Cluster Status Request by 10.165.30.20:54405
HINT: Cluster State: xxxx
UP: 3 of 3 (v_xxxx_node0001, v_xxxx_node0002, v_xxxx_node0003)
LOCATION: initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw membership message 8192 on V:xxxx
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw membership message 8192 on Vertica:all
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw membership message 8192 on Vertica:join
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> Saw membership message 6144 on V:xxxx
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> NETWORK change with 1 VS sets
2016-10-21 03:13:16.260 Spread Client:0x8f5a940 [Comms] <INFO> DB Group changed

 

 

---------------------------

2016-10-18 15:23:21.170 Init Session:0x7f46ec019690-b000000196f6a4 [Txn] <INFO> Starting Commit: Txn: b000000196f6a4 ' select count(node_name) as 'Low' FROM (select node_name, round(used_bytes_end_value*100.00/(used_bytes_end_value+free_bytes_end_value), 2.0) as val_used from v_internal.dc_storage_info_by_minute where time > sysdate() - interval '1 min' and filesystem = 'vertica' order by node_name) as disk_use WHERE disk_use.val_used > (select current_value from configuration_parameters where parameter_name = 'LowDiskSpaceWarningPct')'
2016-10-18 15:23:21.172 Init Session:0x7f46ec019690 [Txn] <INFO> Commit Complete: Txn: b000000196f6a4 at epoch 0x24d6df
2016-10-18 15:23:21.175 Init Session:0x7f46ec019690 [Session] <INFO> [PQuery] TX:0(verticanode2-8536:0x93a86) select CONCAT (cast(count(node_name) as varchar(12)), ' Nodes') as 'Total' from v_internal.dc_storage_info_by_minute where time > sysdate() - interval '1 min' and filesystem = 'vertica'
2016-10-18 15:23:21.176 Init Session:0x7f46ec019690 [Session] <INFO> [Query] TX:0(verticanode2-8536:0x93a86) select CONCAT (cast(count(node_name) as varchar(12)), ' Nodes') as 'Total' from v_internal.dc_storage_info_by_minute where time > sysdate() - interval '1 min' and filesystem = 'vertica'
2016-10-18 15:23:21.180 Init Session:0x7f46ec019690-b000000196f6b9 [Txn] <INFO> Begin Txn: b000000196f6b9 ' select CONCAT (cast(count(node_name) as varchar(12)), ' Nodes') as 'Total' from v_internal.dc_storage_info_by_minute where time > sysdate() - interval '1 min' and filesystem = 'vertica''
2016-10-18 15:23:21.465 Init Session:0x7f46ec013ac0-b000000196f6b7 [EE] <INFO> SortManager found maxMerges 7 too small(0 MB Assigned).
2016-10-18 15:23:21.465 Init Session:0x7f46ec013ac0-b000000196f6b7 [EE] <INFO> After disabling optimization, maxMerges becomes 7.
2016-10-18 15:23:29.001 ManageEpochs:0x7f46d0014d90 [Util] <INFO> Task 'ManageEpochs' enabled
2016-10-18 15:23:29.002 AnalyzeRowCount:0x7f46d0013be0 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2016-10-18 15:23:38.238 Spread Client:0x9694f80 [Comms] <INFO> Saw membership message 8192 on V:xxxx
2016-10-18 15:23:38.238 Spread Client:0x9694f80 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-18 15:23:38.238 Spread Client:0x9694f80 [Comms] <INFO> Saw membership message 8192 on Vertica:all
2016-10-18 15:23:38.238 Spread Client:0x9694f80 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-18 15:23:38.238 Spread Client:0x9694f80 [Comms] <INFO> Saw membership message 8192 on Vertica:join
2016-10-18 15:23:38.238 Spread Client:0x9694f80 [Comms] <INFO> Saw transitional message; watch for lost daemons
2016-10-18 15:23:38.253 Spread Client:0x9694f80 [Comms] <INFO> Saw membership message 6144 on xxxx
2016-10-18 15:23:38.253 Spread Client:0x9694f80 [Comms] <INFO> NETWORK change with 1 VS sets
2016-10-18 15:23:38.253 Spread Client:0x9694f80 [Comms] <INFO> DB Group changed

 

 

Has someone any idea?

Thank you,

Veronica

Comments

  • I found:

    http://www.ca.com/us/services-support/ca-support/ca-support-online/knowledge-base-articles.TEC1825493.html

     

    Network related vertica outage

    Question: How can we determine if the cause of a vertica outage was network related?
    Answer: The following types of messages are seen in the vertica.log if there were network related problems

    2016-01-18 10:35:29.903 Spread Client:0x1238ac00 [Comms] <INFO> Saw membership message 8192 on V:drdata
    2016-01-18 10:35:29.903 Spread Client:0x1238ac00 [Comms] <INFO> Saw transitional message; watch for lost daemons
    2016-01-18 10:35:29.904 Spread Client:0x1238ac00 [Comms] <INFO> Saw membership message 8192 on Vertica:all
    2016-01-18 10:35:29.904 Spread Client:0x1238ac00 [Comms] <INFO> Saw transitional message; watch for lost daemons
    2016-01-18 10:35:29.904 Spread Client:0x1238ac00 [Comms] <INFO> Saw membership message 8192 on Vertica:join
    2016-01-18 10:35:29.904 Spread Client:0x1238ac00 [Comms] <INFO> Saw transitional message; watch for lost daemons

  • The solution was the machines restart.

     

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file