Options

One Vertica Node is down

I installed 3-node Vertica cluster. One of the node is down but not able to figure out the root cause. Anyone can help please. I paste the vertica.log below. Thank you!


2015-01-07 12:37:27.007 LGELaggingCheck:0x7f09e8013540-a0000000001de8 [Txn] <INFO> Starting Commit: Txn: a0000000001de8 'Recovery: Get last good epoch'2015-01-07 12:37:27.008 LGELaggingCheck:0x7f09e8013540 [Txn] <INFO> Commit Complete: Txn: a0000000001de8 at epoch 0x2
2015-01-07 12:37:27.008 LGELaggingCheck:0x7f09e8013540-a0000000001de9 [Txn] <INFO> Begin Txn: a0000000001de9 'Check LGE'
2015-01-07 12:37:27.008 LGELaggingCheck:0x7f09e8013540-a0000000001de9 [Txn] <INFO> Rollback Txn: a0000000001de9 'Check LGE'
2015-01-07 12:37:27.009 LGELaggingCheck:0x7f09e8013540 [Util] <INFO> Task 'LGELaggingCheck' enabled
2015-01-07 12:37:30.013 DistCall Dispatch:0x7f09d8005790-c0000000004063 [Txn] <INFO> Rollback Txn: c0000000004063 'rebalance_cluster(background)'
2015-01-07 12:38:27.005 AnalyzeRowCount:0x7f09e80133e0 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:38:30.009 DistCall Dispatch:0x7f09d80167d0-a0000000001dea [Txn] <INFO> Begin Txn: a0000000001dea 'getRowCountsForProj'
2015-01-07 12:38:30.012 DistCall Dispatch:0x7f09d80167d0-a0000000001dea [Txn] <INFO> Rollback Txn: a0000000001dea 'getRowCountsForProj'
2015-01-07 12:38:30.037 DistCall Dispatch:0x7f09d80167d0-c0000000004065 [Txn] <INFO> Rollback Txn: c0000000004065 'do_analyze_row_count'
2015-01-07 12:39:27.005 ManageEpochs:0x7f09e80116e0 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:39:30.024 DistCall Dispatch:0x7f09d801e290-c0000000004066 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:39:30.033 DistCall Dispatch:0x7f09d801e290-c0000000004066 [Txn] <INFO> Rollback Txn: c0000000004066 'Manage Epochs: Advance AHM'
2015-01-07 12:41:27.009 AnalyzeRowCount:0x7f09e8015060 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:41:30.003 DistCall Dispatch:0x7f09d80167d0-a0000000001deb [Txn] <INFO> Begin Txn: a0000000001deb 'getRowCountsForProj'
2015-01-07 12:41:30.006 DistCall Dispatch:0x7f09d80167d0-a0000000001deb [Txn] <INFO> Rollback Txn: a0000000001deb 'getRowCountsForProj'
2015-01-07 12:41:30.031 DistCall Dispatch:0x7f09d80167d0-c0000000004068 [Txn] <INFO> Rollback Txn: c0000000004068 'do_analyze_row_count'
2015-01-07 12:42:27.007 ManageEpochs:0x7f09e8015060 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:42:27.008 RebalanceCluster:0x7f09e80149d0 [Util] <INFO> Task 'RebalanceCluster' enabled
2015-01-07 12:42:27.012 LGELaggingCheck:0x7f09e80116e0-a0000000001dec [Txn] <INFO> Begin Txn: a0000000001dec 'Recovery: Get last good epoch'
2015-01-07 12:42:27.013 LGELaggingCheck:0x7f09e80116e0-a0000000001dec [Txn] <INFO> Starting Commit: Txn: a0000000001dec 'Recovery: Get last good epoch'
2015-01-07 12:42:27.014 LGELaggingCheck:0x7f09e80116e0 [Txn] <INFO> Commit Complete: Txn: a0000000001dec at epoch 0x2
2015-01-07 12:42:27.015 LGELaggingCheck:0x7f09e80116e0-a0000000001ded [Txn] <INFO> Begin Txn: a0000000001ded 'Check LGE'
2015-01-07 12:42:27.016 LGELaggingCheck:0x7f09e80116e0-a0000000001ded [Txn] <INFO> Rollback Txn: a0000000001ded 'Check LGE'
2015-01-07 12:42:27.017 LGELaggingCheck:0x7f09e80116e0 [Util] <INFO> Task 'LGELaggingCheck' enabled
2015-01-07 12:42:30.024 DistCall Dispatch:0x7f09d8018400-c000000000406a [Txn] <INFO> Rollback Txn: c000000000406a 'rebalance_cluster(background)'
2015-01-07 12:42:30.036 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:42:30.036 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:42:30.036 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:42:30.036 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:42:30.036 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:42:30.036 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:42:30.037 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:42:30.037 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:42:30.037 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:42:30.037 DistCall Dispatch:0x7f09d8028230-c0000000004069 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:42:30.042 DistCall Dispatch:0x7f09d8028230-c0000000004069 [Txn] <INFO> Rollback Txn: c0000000004069 'Manage Epochs: Advance AHM'
2015-01-07 12:44:27.010 AnalyzeRowCount:0x7f09e8012b70 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:44:30.007 DistCall Dispatch:0x7f09d80167d0-a0000000001dee [Txn] <INFO> Begin Txn: a0000000001dee 'getRowCountsForProj'
2015-01-07 12:44:30.009 DistCall Dispatch:0x7f09d80167d0-a0000000001dee [Txn] <INFO> Rollback Txn: a0000000001dee 'getRowCountsForProj'
2015-01-07 12:44:30.027 DistCall Dispatch:0x7f09d80167d0-c000000000406e [Txn] <INFO> Rollback Txn: c000000000406e 'do_analyze_row_count'
2015-01-07 12:45:27.006 ManageEpochs:0x7f09e8011f00 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:45:30.028 DistCall Dispatch:0x7f09d801e290-c000000000406f [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:45:30.028 DistCall Dispatch:0x7f09d801e290-c000000000406f [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:45:30.028 DistCall Dispatch:0x7f09d801e290-c000000000406f [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:45:30.028 DistCall Dispatch:0x7f09d801e290-c000000000406f [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:45:30.029 DistCall Dispatch:0x7f09d801e290-c000000000406f [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:45:30.029 DistCall Dispatch:0x7f09d801e290-c000000000406f [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:45:30.029 DistCall Dispatch:0x7f09d801e290-c000000000406f [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:45:30.029 DistCall Dispatch:0x7f09d801e290-c000000000406f [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:45:30.029 DistCall Dispatch:0x7f09d801e290-c000000000406f [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:45:30.029 DistCall Dispatch:0x7f09d801e290-c000000000406f [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:45:30.036 DistCall Dispatch:0x7f09d801e290-c000000000406f [Txn] <INFO> Rollback Txn: c000000000406f 'Manage Epochs: Advance AHM'
2015-01-07 12:47:27.006 RebalanceCluster:0x7f09e8011f00 [Util] <INFO> Task 'RebalanceCluster' enabled
2015-01-07 12:47:27.007 TM Moveout:0x7f09e80133e0-a0000000001def [Txn] <INFO> Begin Txn: a0000000001def 'Moveout: Tuple Mover'
2015-01-07 12:47:27.010 AnalyzeRowCount:0x7f09e8014890 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:47:27.010 LGELaggingCheck:0x7f09e8012f70-a0000000001df0 [Txn] <INFO> Begin Txn: a0000000001df0 'Recovery: Get last good epoch'
2015-01-07 12:47:27.011 LGELaggingCheck:0x7f09e8012f70-a0000000001df0 [Txn] <INFO> Starting Commit: Txn: a0000000001df0 'Recovery: Get last good epoch'
2015-01-07 12:47:27.011 LGELaggingCheck:0x7f09e8012f70 [Txn] <INFO> Commit Complete: Txn: a0000000001df0 at epoch 0x2
2015-01-07 12:47:27.011 LGELaggingCheck:0x7f09e8012f70-a0000000001df1 [Txn] <INFO> Begin Txn: a0000000001df1 'Check LGE'
2015-01-07 12:47:27.011 LGELaggingCheck:0x7f09e8012f70-a0000000001df1 [Txn] <INFO> Rollback Txn: a0000000001df1 'Check LGE'
2015-01-07 12:47:27.011 LGELaggingCheck:0x7f09e8012f70 [Util] <INFO> Task 'LGELaggingCheck' enabled
2015-01-07 12:47:27.012 TM Moveout:0x7f09e80133e0-a0000000001def [Txn] <INFO> Rollback Txn: a0000000001def 'Moveout: Tuple Mover'
2015-01-07 12:47:27.012 TM Mergeout(00):0x7f09e80116e0-a0000000001df2 [Txn] <INFO> Begin Txn: a0000000001df2 'Mergeout: Tuple Mover'
2015-01-07 12:47:27.013 TM Moveout:0x7f09e80133e0 [Util] <INFO> Task 'TM Moveout' enabled
2015-01-07 12:47:27.013 TM Mergeout(01):0x7f09e8013260-a0000000001df3 [Txn] <INFO> Begin Txn: a0000000001df3 'Mergeout: Tuple Mover'
2015-01-07 12:47:27.014 TM Mergeout(00):0x7f09e80116e0-a0000000001df2 [TM] <INFO> Tuple Mover: nothing to merge out
2015-01-07 12:47:27.014 TM Mergeout(00):0x7f09e80116e0-a0000000001df2 [TM] <INFO> Tuple Mover: no DV to merge out
2015-01-07 12:47:27.014 TM Mergeout(00):0x7f09e80116e0-a0000000001df2 [Txn] <INFO> Rollback Txn: a0000000001df2 'Mergeout: Tuple Mover'
2015-01-07 12:47:27.014 TM Mergeout(00):0x7f09e80116e0-a0000000001df4 [Txn] <INFO> Begin Txn: a0000000001df4 'collectMoveStorageJobs'
2015-01-07 12:47:27.020 TM Mergeout(00):0x7f09e80116e0-a0000000001df4 [Txn] <INFO> Rollback Txn: a0000000001df4 'collectMoveStorageJobs'
2015-01-07 12:47:27.021 TM Mergeout(01):0x7f09e8013260-a0000000001df3 [TM] <INFO> Tuple Mover: nothing to merge out
2015-01-07 12:47:27.021 TM Mergeout(00):0x7f09e80116e0 [Util] <INFO> Task 'TM Mergeout(00)' enabled
2015-01-07 12:47:27.021 TM Mergeout(01):0x7f09e8013260-a0000000001df3 [TM] <INFO> Tuple Mover: no DV to merge out
2015-01-07 12:47:27.021 TM Mergeout(01):0x7f09e8013260-a0000000001df3 [Txn] <INFO> Rollback Txn: a0000000001df3 'Mergeout: Tuple Mover'
2015-01-07 12:47:27.022 TM Mergeout(01):0x7f09e8013260 [Util] <INFO> Task 'TM Mergeout(01)' enabled
2015-01-07 12:47:30.014 DistCall Dispatch:0x7f09d8016f80-a0000000001df5 [Txn] <INFO> Begin Txn: a0000000001df5 'getRowCountsForProj'
2015-01-07 12:47:30.015 DistCall Dispatch:0x7f09d8016f80-a0000000001df5 [Txn] <INFO> Rollback Txn: a0000000001df5 'getRowCountsForProj'
2015-01-07 12:47:30.015 DistCall Dispatch:0x7f09d8028f60-c0000000004073 [Txn] <INFO> Rollback Txn: c0000000004073 'rebalance_cluster(background)'
2015-01-07 12:47:30.055 DistCall Dispatch:0x7f09d8016f80-c0000000004078 [Txn] <INFO> Rollback Txn: c0000000004078 'do_analyze_row_count'
2015-01-07 12:48:27.010 ManageEpochs:0x7f09e8012b70 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:48:30.020 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:48:30.040 DistCall Dispatch:0x7f09d8008e40-c0000000004079 [Txn] <INFO> Rollback Txn: c0000000004079 'Manage Epochs: Advance AHM'
2015-01-07 12:50:27.007 AnalyzeRowCount:0x7f09e80149d0 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:50:30.005 DistCall Dispatch:0x7f09d80167d0-a0000000001df6 [Txn] <INFO> Begin Txn: a0000000001df6 'getRowCountsForProj'
2015-01-07 12:50:30.007 DistCall Dispatch:0x7f09d80167d0-a0000000001df6 [Txn] <INFO> Rollback Txn: a0000000001df6 'getRowCountsForProj'
2015-01-07 12:50:30.025 DistCall Dispatch:0x7f09d80167d0-c000000000407b [Txn] <INFO> Rollback Txn: c000000000407b 'do_analyze_row_count'
2015-01-07 12:51:27.007 ManageEpochs:0x7f09e8015060 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:51:30.017 DistCall Dispatch:0x7f09d80281f0-c000000000407c [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:51:30.026 DistCall Dispatch:0x7f09d80281f0-c000000000407c [Txn] <INFO> Rollback Txn: c000000000407c 'Manage Epochs: Advance AHM'
2015-01-07 12:52:27.009 LGELaggingCheck:0x7f09e8012b70-a0000000001df7 [Txn] <INFO> Begin Txn: a0000000001df7 'Recovery: Get last good epoch'
2015-01-07 12:52:27.009 LGELaggingCheck:0x7f09e8012b70-a0000000001df7 [Txn] <INFO> Starting Commit: Txn: a0000000001df7 'Recovery: Get last good epoch'
2015-01-07 12:52:27.009 LGELaggingCheck:0x7f09e8012b70 [Txn] <INFO> Commit Complete: Txn: a0000000001df7 at epoch 0x2
2015-01-07 12:52:27.009 LGELaggingCheck:0x7f09e8012b70-a0000000001df8 [Txn] <INFO> Begin Txn: a0000000001df8 'Check LGE'
2015-01-07 12:52:27.009 LGELaggingCheck:0x7f09e8012b70-a0000000001df8 [Txn] <INFO> Rollback Txn: a0000000001df8 'Check LGE'
2015-01-07 12:52:27.010 LGELaggingCheck:0x7f09e8012b70 [Util] <INFO> Task 'LGELaggingCheck' enabled
2015-01-07 12:52:27.011 RebalanceCluster:0x7f09e8013260 [Util] <INFO> Task 'RebalanceCluster' enabled
2015-01-07 12:52:30.015 DistCall Dispatch:0x7f09d80281f0-c000000000407f [Txn] <INFO> Rollback Txn: c000000000407f 'rebalance_cluster(background)'
2015-01-07 12:53:27.008 AnalyzeRowCount:0x7f09e8013540 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:53:30.005 DistCall Dispatch:0x7f09d80169e0-a0000000001df9 [Txn] <INFO> Begin Txn: a0000000001df9 'getRowCountsForProj'
2015-01-07 12:53:30.008 DistCall Dispatch:0x7f09d80169e0-a0000000001df9 [Txn] <INFO> Rollback Txn: a0000000001df9 'getRowCountsForProj'
2015-01-07 12:53:30.029 DistCall Dispatch:0x7f09d80169e0-c0000000004081 [Txn] <INFO> Rollback Txn: c0000000004081 'do_analyze_row_count'
2015-01-07 12:54:27.006 ManageEpochs:0x7f09e80133e0 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:54:30.015 DistCall Dispatch:0x7f09d801e250-c0000000004082 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:54:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004082 [Txn] <INFO> Rollback Txn: c0000000004082 'Manage Epochs: Advance AHM'
2015-01-07 12:56:27.021 AnalyzeRowCount:0x7f09e8015060 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:56:30.025 DistCall Dispatch:0x7f09d80169e0-a0000000001dfa [Txn] <INFO> Begin Txn: a0000000001dfa 'getRowCountsForProj'
2015-01-07 12:56:30.027 DistCall Dispatch:0x7f09d80169e0-a0000000001dfa [Txn] <INFO> Rollback Txn: a0000000001dfa 'getRowCountsForProj'
2015-01-07 12:56:30.053 DistCall Dispatch:0x7f09d80169e0-c0000000004084 [Txn] <INFO> Rollback Txn: c0000000004084 'do_analyze_row_count'
2015-01-07 12:57:27.042 LGELaggingCheck:0x7f09e80133e0-a0000000001dfb [Txn] <INFO> Begin Txn: a0000000001dfb 'Recovery: Get last good epoch'
2015-01-07 12:57:27.042 RebalanceCluster:0x7f09e8012b70 [Util] <INFO> Task 'RebalanceCluster' enabled
2015-01-07 12:57:27.046 LGELaggingCheck:0x7f09e80133e0-a0000000001dfb [Txn] <INFO> Starting Commit: Txn: a0000000001dfb 'Recovery: Get last good epoch'
2015-01-07 12:57:27.047 LGELaggingCheck:0x7f09e80133e0 [Txn] <INFO> Commit Complete: Txn: a0000000001dfb at epoch 0x2
2015-01-07 12:57:27.048 ManageEpochs:0x7f09e80149d0 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 12:57:27.049 LGELaggingCheck:0x7f09e80133e0-a0000000001dfc [Txn] <INFO> Begin Txn: a0000000001dfc 'Check LGE'
2015-01-07 12:57:27.050 LGELaggingCheck:0x7f09e80133e0-a0000000001dfc [Txn] <INFO> Rollback Txn: a0000000001dfc 'Check LGE'
2015-01-07 12:57:27.050 LGELaggingCheck:0x7f09e80133e0 [Util] <INFO> Task 'LGELaggingCheck' enabled
2015-01-07 12:57:30.020 DistCall Dispatch:0x7f09d8018400-c0000000004088 [Txn] <INFO> Rollback Txn: c0000000004088 'rebalance_cluster(background)'
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Catalog] <INFO> Found no missed restore table events
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Catalog] <INFO> Found no missed replace node events
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 12:57:30.035 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 12:57:30.044 DistCall Dispatch:0x7f09d80281f0-c0000000004087 [Txn] <INFO> Rollback Txn: c0000000004087 'Manage Epochs: Advance AHM'
2015-01-07 12:59:27.005 AnalyzeRowCount:0x7f09e8013260 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 12:59:30.012 DistCall Dispatch:0x7f09d80169e0-a0000000001dfd [Txn] <INFO> Begin Txn: a0000000001dfd 'getRowCountsForProj'
2015-01-07 12:59:30.018 DistCall Dispatch:0x7f09d80169e0-a0000000001dfd [Txn] <INFO> Rollback Txn: a0000000001dfd 'getRowCountsForProj'
2015-01-07 12:59:30.043 DistCall Dispatch:0x7f09d80169e0-c000000000408a [Txn] <INFO> Rollback Txn: c000000000408a 'do_analyze_row_count'
2015-01-07 13:00:27.007 ManageEpochs:0x7f09e8013260 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Catalog] <INFO> Found no missed alter partition events
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Catalog] <INFO> Checking for missed restore table events
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Catalog] <INFO> Found no missed restore table events
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Catalog] <INFO> Checking for missed replace node events
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Catalog] <INFO> Found no missed replace node events
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 13:00:30.022 DistCall Dispatch:0x7f09d801e250-c000000000408b [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 13:00:30.028 DistCall Dispatch:0x7f09d801e250-c000000000408b [Txn] <INFO> Rollback Txn: c000000000408b 'Manage Epochs: Advance AHM'
2015-01-07 13:02:27.006 RebalanceCluster:0x7f09e80149d0 [Util] <INFO> Task 'RebalanceCluster' enabled
2015-01-07 13:02:27.007 AnalyzeRowCount:0x7f09e8013260 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 13:02:27.007 LGELaggingCheck:0x7f09e80133e0-a0000000001dfe [Txn] <INFO> Begin Txn: a0000000001dfe 'Recovery: Get last good epoch'
2015-01-07 13:02:27.008 LGELaggingCheck:0x7f09e80133e0-a0000000001dfe [Txn] <INFO> Starting Commit: Txn: a0000000001dfe 'Recovery: Get last good epoch'
2015-01-07 13:02:27.008 LGELaggingCheck:0x7f09e80133e0 [Txn] <INFO> Commit Complete: Txn: a0000000001dfe at epoch 0x2
2015-01-07 13:02:27.008 LGELaggingCheck:0x7f09e80133e0-a0000000001dff [Txn] <INFO> Begin Txn: a0000000001dff 'Check LGE'
2015-01-07 13:02:27.009 LGELaggingCheck:0x7f09e80133e0-a0000000001dff [Txn] <INFO> Rollback Txn: a0000000001dff 'Check LGE'
2015-01-07 13:02:27.010 LGELaggingCheck:0x7f09e80133e0 [Util] <INFO> Task 'LGELaggingCheck' enabled
2015-01-07 13:02:30.012 DistCall Dispatch:0x7f09d80136a0-a0000000001e00 [Txn] <INFO> Begin Txn: a0000000001e00 'getRowCountsForProj'
2015-01-07 13:02:30.013 DistCall Dispatch:0x7f09d80136a0-a0000000001e00 [Txn] <INFO> Rollback Txn: a0000000001e00 'getRowCountsForProj'
2015-01-07 13:02:30.015 DistCall Dispatch:0x7f09d801e250-c000000000408c [Txn] <INFO> Rollback Txn: c000000000408c 'rebalance_cluster(background)'
2015-01-07 13:02:30.028 DistCall Dispatch:0x7f09d80136a0-c0000000004090 [Txn] <INFO> Rollback Txn: c0000000004090 'do_analyze_row_count'
2015-01-07 13:03:27.003 ManageEpochs:0x7f09e80149d0 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Catalog] <INFO> Found no missed restore table events
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Catalog] <INFO> Found no missed replace node events
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 13:03:30.021 DistCall Dispatch:0x7f09d801e250-c0000000004091 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 13:03:30.029 DistCall Dispatch:0x7f09d801e250-c0000000004091 [Txn] <INFO> Rollback Txn: c0000000004091 'Manage Epochs: Advance AHM'
2015-01-07 13:05:27.009 AnalyzeRowCount:0x7f09e80116e0 [Util] <INFO> Task 'AnalyzeRowCount' enabled
2015-01-07 13:05:30.008 DistCall Dispatch:0x7f09d80169e0-a0000000001e01 [Txn] <INFO> Begin Txn: a0000000001e01 'getRowCountsForProj'
2015-01-07 13:05:30.011 DistCall Dispatch:0x7f09d80169e0-a0000000001e01 [Txn] <INFO> Rollback Txn: a0000000001e01 'getRowCountsForProj'
2015-01-07 13:05:30.031 DistCall Dispatch:0x7f09d80169e0-c0000000004093 [Txn] <INFO> Rollback Txn: c0000000004093 'do_analyze_row_count'
2015-01-07 13:06:27.003 ManageEpochs:0x7f09e8015060 [Util] <INFO> Task 'ManageEpochs' enabled
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Catalog] <INFO> Checking for missed alter partition events
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Catalog] <INFO> Found no missed alter partition events
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Catalog] <INFO> Checking for missed restore table events
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Catalog] <INFO> Found no missed restore table events
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Catalog] <INFO> Checking for missed replace node events
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Catalog] <INFO> Found no missed replace node events
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
2015-01-07 13:06:30.018 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
2015-01-07 13:06:30.019 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
2015-01-07 13:06:30.026 DistCall Dispatch:0x7f09d80281f0-c0000000004094 [Txn] <INFO> Rollback Txn: c0000000004094 'Manage Epochs: Advance AHM'
2015-01-07 13:07:27.014 TM Moveout:0x7f09e80116e0-a0000000001e02 [Txn] <INFO> Begin Txn: a0000000001e02 'Moveout: Tuple Mover'
2015-01-07 13:07:27.016 TM Moveout:0x7f09e80116e0-a0000000001e02 [Txn] <INFO> Rollback Txn: a0000000001e02 'Moveout: Tuple Mover'
2015-01-07 13:07:27.022 TM Mergeout(01):0x7f09e8015060-a0000000001e03 [Txn] <INFO> Begin Txn: a0000000001e03 'Mergeout: Tuple Mover'
2015-01-07 13:07:27.024 TM Mergeout(00):0x7f09e8013260-a0000000001e04 [Txn] <INFO> Begin Txn: a0000000001e04 'Mergeout: Tuple Mover'
2015-01-07 13:07:27.025 TM Moveout:0x7f09e80116e0 [Util] <INFO> Task 'TM Moveout' enabled
2015-01-07 13:07:27.027 TM Mergeout(00):0x7f09e8013260-a0000000001e04 [TM] <INFO> Tuple Mover: nothing to merge out
2015-01-07 13:07:27.027 TM Mergeout(00):0x7f09e8013260-a0000000001e04 [TM] <INFO> Tuple Mover: no DV to merge out
2015-01-07 13:07:27.027 TM Mergeout(00):0x7f09e8013260-a0000000001e04 [Txn] <INFO> Rollback Txn: a0000000001e04 'Mergeout: Tuple Mover'
2015-01-07 13:07:27.027 TM Mergeout(00):0x7f09e8013260-a0000000001e05 [Txn] <INFO> Begin Txn: a0000000001e05 'collectMoveStorageJobs'
2015-01-07 13:07:27.031 RebalanceCluster:0x7f09e80133e0 [Util] <INFO> Task 'RebalanceCluster' enabled
2015-01-07 13:07:27.034 TM Mergeout(01):0x7f09e8015060-a0000000001e03 [TM] <INFO> Tuple Mover: nothing to merge out
2015-01-07 13:07:27.035 TM Mergeout(01):0x7f09e8015060-a0000000001e03 [TM] <INFO> Tuple Mover: no DV to merge out
2015-01-07 13:07:27.035 TM Mergeout(01):0x7f09e8015060-a0000000001e03 [Txn] <INFO> Rollback Txn: a0000000001e03 'Mergeout: Tuple Mover'
2015-01-07 13:07:27.037 LGELaggingCheck:0x7f09e8012b70-a0000000001e06 [Txn] <INFO> Begin Txn: a0000000001e06 'Recovery: Get last good epoch'
2015-01-07 13:07:27.037 LGELaggingCheck:0x7f09e8012b70-a0000000001e06 [Txn] <INFO> Starting Commit: Txn: a0000000001e06 'Recovery: Get last good epoch'
2015-01-07 13:07:27.037 LGELaggingCheck:0x7f09e8012b70 [Txn] <INFO> Commit Complete: Txn: a0000000001e06 at epoch 0x2
2015-01-07 13:07:27.039 TM Mergeout(00):0x7f09e8013260-a0000000001e05 [Txn] <INFO> Rollback Txn: a0000000001e05 'collectMoveStorageJobs'
2015-01-07 13:07:27.039 TM Mergeo

Comments

  • Options
    Hi

    Can you kindly let us know whether you restarted the down node using admintools?Is it up now or still down?

    Provide us the output of below command:

    $ admintools -t list_allnodes
    $ admintools -t view_cluster

    Also this vertica.log snippet doesn't indicate much as it is not complete.So in order to look for the possible reason behind this crash will request you to kindly provide us the vertica.log file messages from the time when node crashed. Can you check for any Panic message or provide us the all the lines (whatever possible at max.) before the time of the node shutdown?


    Regards
    Rahul




  • Options
    Hi Rahul,

    I restarted this node. It's up now.

    $ admintools -t list_allnodes Node                  | Host         | State | Version           | DB         
    -----------------------+--------------+-------+-------------------+------------
     v_verticabip_node0001 | 15.50.17.123 | UP    | vertica-7.1.1.000 | verticabip 
     v_verticabip_node0002 | 15.50.5.108  | UP    | vertica-7.1.1.000 | verticabip 
     v_verticabip_node0003 | 15.50.14.194 | UP    | vertica-7.1.1.000 | verticabip 

    $admintools -t view_cluster DB         | Host | State 
    ------------+------+-------
     verticabip | ALL  | UP 


    Below is the entire log from start to down. Please help to analyze. Thank you.

    2015-01-07 02:57:17.696 INFO New log
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> Log /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/vertica.log opened; #1
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> Processing command line: /opt/vertica/bin/vertica -D /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog -C verticabip -n v_verticabip_node0001 -h 15.50.17.123 -p 5433 -P 4803 -Y ipv4
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> Starting up Vertica Analytic Database v7.1.1-0
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> Project Codename: Dragline
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> vertica(v7.1.1-0) built by release@build2.verticacorp.com from releases/VER_7_1_RELEASE_BUILD_1_0_20141016@148158 on 'Thu Oct 16 15:55:09 America/New_York 2014' $BuildId$
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> 64-bit Optimized Build
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 [Init] <INFO> Compiler Version: 4.1.2 20080704 (Red Hat 4.1.2-55)
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 <LOG> @[initializing]: 00000/5081: Total swap memory used: 0
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 <LOG> @[initializing]: 00000/4435: Process size resident set: 28569600
    2015-01-07 02:57:17.696 unknown:0x7ff80d3fc700 <LOG> @[initializing]: 00000/5075: Total Memory free + cache: 6751404032
    2015-01-07 02:57:17.697 unknown:0x7ff80d3fc700 [Txn] <INFO> Looking for catalog at: /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/Catalog
    2015-01-07 02:57:17.697 unknown:0x7ff80d3fc700 [Catalog] <INFO> Loading Checkpoint 1
    2015-01-07 02:57:17.697 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Reading Checkpoint (bytes) - 0 / 5273
    2015-01-07 02:57:17.699 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Reading Checkpoint (bytes) - 5273 / 5273
    2015-01-07 02:57:17.699 unknown:0x7ff80d3fc700 [Catalog] <INFO> Replaying 4 Txnlogs
    2015-01-07 02:57:17.699 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Applying transaction log (bytes) - 0 / 258987
    2015-01-07 02:57:17.707 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Applying transaction log (bytes) - 55427 / 258987
    2015-01-07 02:57:17.722 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Applying transaction log (bytes) - 140216 / 258987
    2015-01-07 02:57:17.740 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Applying transaction log (bytes) - 258987 / 258987
    2015-01-07 02:57:17.740 unknown:0x7ff80d3fc700 [Txn] <INFO> Installing objects...
    2015-01-07 02:57:17.740 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Indexing Objects - 0 / 235
    2015-01-07 02:57:17.741 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Reading Catalog] Indexing Objects - 226 / 235
    2015-01-07 02:57:17.741 unknown:0x7ff80d3fc700 [Txn] <INFO> Catalog loaded from path: /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/Catalog [235 objects, GLOBAL version 122, LOCAL version 12] (no checkpoint needed)
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Txn] <INFO> switchToLocalNode: v_verticabip_node0001 with path /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/Catalog
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Txn] <INFO> Transaction sequence set, seq num=1844, nodeID=a
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Txn] <INFO> Catalog sequence set, seq num=11447, nodeID=a
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Txn] <INFO> Found my node (v_verticabip_node0001) in the catalog
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Txn] <INFO> Catalog info: version=0x7a, number of nodes=3, permanent #=3, K=1
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Txn] <INFO> Catalog info: current epoch=0x2
    2015-01-07 02:57:17.742 unknown:0x7ff80d3fc700 [Init] <INFO> Catalog loaded
    2015-01-07 02:57:17.745 unknown:0x7ff80d3fc700 [Comms] <INFO> About to launch spread with '/opt/vertica/spread/sbin/spread -c /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/spread.conf'
    2015-01-07 02:57:17.746 unknown:0x7ff80d3fc700 [Comms] <INFO> forked spread pid=14404, wrote pidfile /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/spread.pid
    2015-01-07 02:57:17.754 unknown:0x7ff80d3fc700 <LOG> @v_verticabip_node0001: VX001/2832: Could not create IPv6 socket: No such file or directory
    2015-01-07 02:57:17.754 unknown:0x7ff80d3fc700 [Init] <INFO> Listening on port: 5433
    2015-01-07 02:57:17.754 unknown:0x7ff80d3fc700 [Init] <INFO> About to fork
    2015-01-07 02:57:17.756 unknown:0x7ff80d3fc700 [Init] <INFO> About to fork again
    2015-01-07 02:57:17.759 unknown:0x7ff80d3fc700 [Init] <INFO> Completed forking
    2015-01-07 02:57:17.760 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Connecting to Spread] Connecting to spread 4803
    2015-01-07 02:57:17.761 unknown:0x7ff80d3fc700 [Comms] <INFO> connected to spread as #node_a#N015050017123
    2015-01-07 02:57:17.761 unknown:0x7ff80d3fc700 [Init] <INFO> PID=14410
    2015-01-07 02:57:17.761 unknown:0x7ff80d3fc700 [Init] <INFO> Start reading DataCollector information
    2015-01-07 02:57:17.773 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 0 / 459721932
    2015-01-07 02:57:18.195 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 46374231 / 459721932
    2015-01-07 02:57:18.741 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 96188542 / 459721932
    2015-01-07 02:57:19.151 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 146453332 / 459721932
    2015-01-07 02:57:19.486 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 193813439 / 459721932
    2015-01-07 02:57:19.767 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 240861889 / 459721932
    2015-01-07 02:57:20.039 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 288137037 / 459721932
    2015-01-07 02:57:20.573 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 334388201 / 459721932
    2015-01-07 02:57:21.177 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 381919842 / 459721932
    2015-01-07 02:57:21.570 unknown:0x7ff80d3fc700 [Init] <INFO> Startup [Read DataCollector] Inventory files (bytes) - 433933653 / 459721932
    2015-01-07 02:57:21.825 unknown:0x7ff80d3fc700 [Init] <INFO> Done reading DataCollector information
    2015-01-07 02:57:21.826 Main:0x7245930 <WARNING> @v_verticabip_node0001: 57V03/4973: The UDx zygote process is down, restarting it...
    2015-01-07 02:57:21.836 Main:0x7245930 [Init] <INFO> Setting up Messenger service
    2015-01-07 02:57:21.837 Main:0x7245930 [Dist] <INFO> Cluster point-point listener port 5434
    2015-01-07 02:57:21.839 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3885: Locking [/data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/vertica.pid], fdLocked == 12
    2015-01-07 02:57:22.005 Main:0x7245930 [Init] <INFO> Using "UTC" as the default timezone
    2015-01-07 02:57:22.007 Main:0x7245930 [Txn] <INFO> Begin Txn: fff0000000000cc1 'Initialization'
    2015-01-07 02:57:22.009 Main:0x7245930 [Catalog] <INFO> Running SINGLE upgrade tasks v7.1.1-0(148158) -> v7.1.1-0(148158)
    2015-01-07 02:57:22.010 Main:0x7245930 [Catalog] <INFO> License Audit Record Hash Check Passed.
    2015-01-07 02:57:22.010 Main:0x7245930 [Catalog] <INFO> Full License Hash Check Passed.
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool general -  Queries: 10000 Threads: 5386 File Handles: 52339 Memory(KB): 6517528
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool sysquery -  Queries: 10000 Threads: 5439 File Handles: 52857 Memory(KB): 6583064
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool sysdata -  Memory(KB): 689026
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool wosdata -  Memory(KB): 1722566
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool tm -  Queries: 3 Threads: 5552 File Handles: 53958 Memory(KB): 6722328
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool refresh -  Queries: 10000 Threads: 5386 File Handles: 52339 Memory(KB): 6517528
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool recovery -  Queries: 3 Threads: 5386 File Handles: 52339 Memory(KB): 6517528
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool dbd -  Queries: 10000 Threads: 5386 File Handles: 52339 Memory(KB): 6517528
    2015-01-07 02:57:22.011 Main:0x7245930 [ResourceManager] <INFO> pool jvm -  Queries: 10000 Threads: 569 File Handles: 5533 Memory(KB): 689026
    2015-01-07 02:57:22.031 Main:0x7245930 [ResourceManager] <INFO> overallLimits -  Queries: 10000 Threads: 5605 File Handles: 54476 Memory(KB): 6890264
    2015-01-07 02:57:22.031 Main:0x7245930 [ResourceManager] <INFO> Initialized.
    2015-01-07 02:57:22.035 Main:0x7245930 [SAL] <INFO> Block memory manager initialization complete, 24 chunks created
    2015-01-07 02:57:22.035 Main:0x7245930 [SAL] <INFO> Creating Hadoop FS watchdog
    2015-01-07 02:57:22.036 Main:0x7245930 [Init] <INFO> SAL initialization complete
    2015-01-07 02:57:22.036 Main:0x7245930 [SAL] <INFO> Storage Location /data/vertica/data/verticabip/v_verticabip_node0001_data Exists
    2015-01-07 02:57:22.037 Main:0x7245930 [Init] <INFO> Startup [Check Storage] Removing unnecessary storage files
    2015-01-07 02:57:22.098 Main:0x7245930 [Init] <INFO> Startup [Check Storage] Confirming storage matches catalog (files) - 0 / 2
    2015-01-07 02:57:22.099 Main:0x7245930 [Init] <INFO> Startup [Check Storage] Confirming storage matches catalog (files) - 2 / 2
    2015-01-07 02:57:22.099 Main:0x7245930 [Recover] <INFO> Loading UDx libraries
    2015-01-07 02:57:22.099 Main:0x7245930 [Command] <INFO> (fff0000000000cc1) Loading library file [/data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/Libraries/public_FlexTableLib_45035996273736164/public_FlexTableLib_45035996273736164.so]
    2015-01-07 02:57:22.143 Main:0x7245930 [EE] <INFO> Trying to set up a new UDx side process
    2015-01-07 02:57:22.160 Main:0x7245930 [Command] <INFO> (fff0000000000cc1) Loading library file [/data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/Libraries/TxtIndex_TxtIndexLib_45035996273736408/TxtIndex_TxtIndexLib_45035996273736408.so]
    2015-01-07 02:57:22.171 Main:0x7245930 [Command] <INFO> (fff0000000000cc1) Loading library file [/data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/Libraries/public_ApproximateLib_45035996273736508/public_ApproximateLib_45035996273736508.so]
    2015-01-07 02:57:22.181 Main:0x7245930 [Recover] <INFO> Setting up UDx pointers
    2015-01-07 02:57:22.196 Main:0x7245930 [Command] <INFO> (fff0000000000cc1) Library Handle: 0x76c2590
    2015-01-07 02:57:22.197 Main:0x7245930 [Command] <INFO> (fff0000000000cc1) Library Handle: 0x76ca000
    2015-01-07 02:57:22.198 Main:0x7245930 [Command] <INFO> (fff0000000000cc1) Library Handle: 0x76cbb10
    2015-01-07 02:57:31.363 Main:0x7245930 [Txn] <INFO> Starting Commit: Txn: fff0000000000cc1 'Initialization'
    2015-01-07 02:57:31.736 Main:0x7245930 [Txn] <INFO> Commit Complete: Txn: fff0000000000cc1 at epoch 0x2
    2015-01-07 02:57:31.736 Main:0x7245930 [Init] <INFO> Setting up Timer service
    2015-01-07 02:57:31.737 Main:0x7245930 [Util] <INFO> TMTasks: stopped
    2015-01-07 02:57:31.737 Main:0x7245930 [TM] <INFO> Tuple mover: stopped
    2015-01-07 02:57:31.737 Main:0x7245930 [TM] <INFO> Tuple mover initialized 7 individual tasks: 
    2015-01-07 02:57:31.737 Main:0x7245930 [Init] <INFO> Setting up Spread service
    2015-01-07 02:57:31.737 Main:0x7245930 [Comms] <INFO> InitSpread: I am node v_verticabip_node0001 (oid 45035996273704980)
    2015-01-07 02:57:31.737 Main:0x7245930 [Comms] <INFO> InitSpread: Starting at address 15.50.17.123
    2015-01-07 02:57:31.737 Spread Mailbox Dequeue:0x7a76d10 [Comms] <INFO> Spread dequeue thread starting
    2015-01-07 02:57:31.738 Main:0x7245930 [Comms] <INFO> Opening UDP socket on 5433 to send/receive spread monitoring messages
    2015-01-07 02:57:31.738 Main:0x7245930 [Init] <INFO> Setting up Tuple Mover service
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'AnalyzeRowCount' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'LGELaggingCheck' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'ManageEpochs' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'PartitionTables' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'RebalanceCluster' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'TM Moveout' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'TM Mergeout(00)' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> Task 'TM Mergeout(01)' enabled
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> TMTasks: started
    2015-01-07 02:57:31.738 Main:0x7245930 [TM] <INFO> Tuple mover: started
    2015-01-07 02:57:31.738 Main:0x7245930 [Util] <INFO> TMTasks: stopped
    2015-01-07 02:57:31.738 Main:0x7245930 [Init] <INFO> Setting up Recovery assessment service
    2015-01-07 02:57:31.738 Main:0x7245930 [Recover] <INFO> Started startup task: request ID = 9
    2015-01-07 02:57:31.738 Main:0x7245930 [Init] <INFO> Setting up Connection Listener service
    2015-01-07 02:57:31.738 Main:0x7245930 [Init] <INFO> Database character set: UTF-8
    2015-01-07 02:57:31.738 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4443: Processor count: 4
    2015-01-07 02:57:31.738 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4442: Processor core count: 4
    2015-01-07 02:57:31.738 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4444: Processor type: AMD Opteron(TM) Processor 6238                 
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3907: Maximum Memory Regions: 503674
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3910: Maximum Threads for this user: 7869
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3908: Maximum Open Files for this process: 65536
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3906: Maximum File Size for this process: 18446744073709551615
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3904: Maximum Core File for this process: 0
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3903: Maximum Address Space for this process: 18446744073709551615
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3905: Maximum Data Segment for this process: 18446744073709551615
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3909: Maximum Stack Size for this process: 10485760
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3868: Load average (1 min): 0.41
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3870: Load average (5 min): 0.14
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/3869: Load average (15 min): 0.03
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/5079: Total memory: 8252198912
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/5076: Total memory free: 4544069632
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/5073: Total memory buffers: 412639232
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/5074: Total memory cache: 2097348608
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/5082: Total swap memory: 2147475456
    2015-01-07 02:57:31.739 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/5080: Total swap memory free: 2147475456
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4274: Open Files: 8
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4276: Open sockets: 5
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4275: Open other: 6
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4438: Process size: 1891155968
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4435: Process size resident set: 141303808
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4436: Process size shared: 25378816
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4437: Process size text: 77819904
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4432: Process size data&colon; 1651494912
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4434: Process size lib: 0
    2015-01-07 02:57:31.740 Main:0x7245930 <LOG> @v_verticabip_node0001: 00000/4433: Process size dirty: 0
    2015-01-07 02:57:31.741 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/2705: Connection received: host=15.50.17.123 port=38743 (connCnt 1)
    2015-01-07 02:57:31.741 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/4540: Received SSL negotiation startup packet
    2015-01-07 02:57:31.741 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/4691: Sending SSL negotiation response 'N'
    2015-01-07 02:57:31.742 Init Session:0x7ff7a0010fc0 <FATAL> @v_verticabip_node0001: {SessionRun} 57V03/5785: Cluster Status Request by 15.50.17.123:38743
    HINT:  Cluster State: verticabip
    INITIALIZING: 1 of 3 (v_verticabip_node0001)
    ----
    LOCATION:  initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
    2015-01-07 02:57:31.754 Main:0x7245930 [Util] <INFO> Task 'DBSnapshotRemoval' enabled
    2015-01-07 02:57:31.754 Main:0x7245930 [Util] <INFO> Task 'FeatureUseLogger' enabled
    2015-01-07 02:57:31.754 Main:0x7245930 [Util] <INFO> Task 'LicenseSizeAuditor' enabled
    2015-01-07 02:57:31.754 Main:0x7245930 [Util] <INFO> Task 'LicenseTermAuditor' enabled
    2015-01-07 02:57:31.754 Main:0x7245930 [Util] <INFO> SystemTasks: started
    2015-01-07 02:57:32.000 Cluster Inviter:0x7ff79c0118c0 [Init] <INFO> Startup [Waiting for Cluster Invite] Prepare to be invited
    2015-01-07 02:57:32.002 Spread Client:0x7a7fb70 [Comms] <INFO> Saw membership message 4352 on Vertica:join
    2015-01-07 02:57:32.002 Spread Client:0x7a7fb70 [Comms] <INFO> Joined Vertica:join as #node_a#N015050017123
    2015-01-07 02:57:32.002 Spread Client:0x7a7fb70 [Comms] <INFO> Saw membership message 4352 on Vertica:all
    2015-01-07 02:57:32.002 Spread Client:0x7a7fb70 [Comms] <INFO> Joined V:all as #node_a#N015050017123
    2015-01-07 02:57:32.002 Spread Client:0x7a7fb70 [Init] <INFO> Startup [Waiting for Cluster Invite] Ready to be invited
    2015-01-07 02:57:33.800 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/2705: Connection received: host=15.50.17.123 port=38753 (connCnt 1)
    2015-01-07 02:57:33.800 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/4540: Received SSL negotiation startup packet
    2015-01-07 02:57:33.800 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/4691: Sending SSL negotiation response 'N'
    2015-01-07 02:57:33.801 Init Session:0x7ff7a0010fc0 <FATAL> @v_verticabip_node0001: {SessionRun} 57V03/5785: Cluster Status Request by 15.50.17.123:38753
    HINT:  Cluster State: verticabip
    INITIALIZING: 1 of 3 (v_verticabip_node0001)
    ----
    LOCATION:  initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
    2015-01-07 02:57:37.028 Spread Client:0x7a7fb70 [Init] <INFO> Startup [Waiting for Cluster Invite] Receiving catalog (chunks) - 1 / 1
    2015-01-07 02:57:37.028 Spread Client:0x7a7fb70 [VMPI] <INFO> Received starting chunk of catalog from 45035996273720622 chunk 1/1 (103956 bytes)
    2015-01-07 02:57:37.036 Spread Client:0x7a7fb70 [VMPI] <INFO> Cluster is being formed; we are invited
    2015-01-07 02:57:37.036 Spread Client:0x7a7fb70 [Init] <INFO> Startup [Waiting for Cluster Invite] Invited
    2015-01-07 02:57:37.036 Spread Client:0x7a7fb70-fff0000000000bca [Txn] <INFO> Begin Txn: fff0000000000bca 'Installing New Catalog'
    2015-01-07 02:57:37.036 Spread Client:0x7a7fb70-fff0000000000bca [Catalog] <INFO> installNewCatalog: Received new catalog, replacing current TRANSACTION-fff0000000000bca catalog (old version=0x7a, new version 0x7b)
    2015-01-07 02:57:37.039 Spread Client:0x7a7fb70-fff0000000000bca [Catalog] <INFO> replaceCatalog: dropping 0 objects
    2015-01-07 02:57:37.042 Spread Client:0x7a7fb70-fff0000000000bca [Catalog] <INFO> Catalog OID generator updated based on SYSTEM tier catalog
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70-fff0000000000bca [Catalog] <INFO> License Audit Record Hash Check Passed.
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70-fff0000000000bca [Catalog] <INFO> Full License Hash Check Passed.
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70-fff0000000000bca [Txn] <INFO> Starting Commit: Txn: fff0000000000bca 'Installing New Catalog'
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70 [Txn] <INFO> Commit Complete: Txn: fff0000000000bca at epoch 0x2
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70 [Util] <INFO> TMTasks: started
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70 [TM] <INFO> Tuple mover: started
    2015-01-07 02:57:37.053 Spread Client:0x7a7fb70 [Util] <INFO> TMTasks: stopped
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [SAL] <INFO> Storage Location /data/vertica/data/verticabip/v_verticabip_node0001_data Exists
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Init] <INFO> Validated Storage Location /data/vertica/data/verticabip/v_verticabip_node0001_data
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Txn] <INFO> Joining DB group with catalog version 123
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Comms] <INFO> joinDBGroup: Node v_verticabip_node0001 (#node_a#N015050017123) joining cluster for DB verticabip
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Comms] <INFO> nodeToState map:
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0003 : INITIALIZING
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Recover] <INFO> State change for node v_verticabip_node0003: INITIALIZING; catalog 123
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Comms] <INFO> nodeToState map:
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0002 : INITIALIZING
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0003 : INITIALIZING
    2015-01-07 02:57:37.054 Spread Client:0x7a7fb70 [Recover] <INFO> State change for node v_verticabip_node0002: INITIALIZING; catalog 123
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> nodeToState map:
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0001 : INITIALIZING
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0002 : INITIALIZING
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0003 : INITIALIZING
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Recover] <INFO> State change for node v_verticabip_node0001: INITIALIZING; catalog 123
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> Saw membership message 4352 on V:verticabip
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> DB Group changed
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [VMPI] <INFO> DistCall: Set current group members called with 3 members
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> nodeSetNotifier: node v_verticabip_node0003 joined the cluster
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> nodeSetNotifier: node v_verticabip_node0001 joined the cluster
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> nodeSetNotifier: node v_verticabip_node0002 joined the cluster
    2015-01-07 02:57:37.056 Spread Client:0x7a7fb70 [Comms] <INFO> trackGroups: I (15.50.17.123, #node_a#N015050017123, v_verticabip_node0001, 45035996273704980) am joining V:db group
    2015-01-07 02:57:37.890 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/2705: Connection received: host=15.50.17.123 port=38761 (connCnt 1)
    2015-01-07 02:57:37.890 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/4540: Received SSL negotiation startup packet
    2015-01-07 02:57:37.890 Init Session:0x7ff7a0010fc0 <LOG> @v_verticabip_node0001: 00000/4691: Sending SSL negotiation response 'N'
    2015-01-07 02:57:37.891 Init Session:0x7ff7a0010fc0 <FATAL> @v_verticabip_node0001: {SessionRun} 57V03/5785: Cluster Status Request by 15.50.17.123:38761
    HINT:  Cluster State: verticabip
    INITIALIZING: 3 of 3 (v_verticabip_node0001, v_verticabip_node0002, v_verticabip_node0003)
    ----
    LOCATION:  initSession, /scratch_a/release/30493/vbuild/vertica/Session/ClientSession.cpp:436
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0 [Recover] <INFO> Started recovery assessment task: request ID = 38
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Txn] <INFO> Begin Txn: a0000000001845 'ProjUtil::getLocalNodeLGE'
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Catalog] <INFO> Checking for missed alter partition events
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Catalog] <INFO> Found no missed alter partition events
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Catalog] <INFO> Checking for missed restore table events
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Catalog] <INFO> Found no missed restore table events
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Catalog] <INFO> Checking for missed replace node events
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Catalog] <INFO> Found no missed replace node events
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0-a0000000001845 [Txn] <INFO> Rollback Txn: a0000000001845 'ProjUtil::getLocalNodeLGE'
    2015-01-07 02:57:38.000 Timer Service:0x7a8bfc0 [Recover] <INFO>   My local node LGE = 0x1 and current epoch = 0x2
    2015-01-07 02:57:38.002 Recover:0x7ff79c0114f0 [Recover] <INFO> Node v_verticabip_node0001 assessing recovery: I am not clerk
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Catalog] <INFO> Checking for missed alter partition events
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Catalog] <INFO> Found no missed alter partition events
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Catalog] <INFO> Checking for missed restore table events
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Catalog] <INFO> Found no missed restore table events
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Catalog] <INFO> Checking for missed replace node events
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Catalog] <INFO> Found no missed replace node events
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Recover] <INFO> Get Local Node LGE: node current cpe is 0x1
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Recover] <INFO> Get Local Node LGE: projection training.test_b1, CPE is 0x1
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Recover] <INFO> Get Local Node LGE: projection training.test_b0, CPE is 0x1
    2015-01-07 02:57:38.022 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x1
    2015-01-07 02:57:38.027 DistCall Dispatch:0x7ff7a40069c0-c0000000003a30 [Txn] <INFO> Rollback Txn: c0000000003a30 'sendGetClusterLGE'
    2015-01-07 02:57:38.110 DistCall Dispatch:0x7ff7a40069c0-c0000000003a31 [Txn] <INFO> Rollback Txn: c0000000003a31 'sendCheckMissingLibraries'
    2015-01-07 02:57:38.129 DistCall Dispatch:0x7ff7a40069c0-c0000000003a32 [Recover] <INFO> Skipping recovery for node v_verticabip_node0001 based on recovery clerk instructions
    2015-01-07 02:57:38.130 DistCall Dispatch:0x7ff7a40069c0-c0000000003a32 <LOG> @v_verticabip_node0001: 00000/3298: Event Posted: Event Code:6 Event Id:7 Event Severity: Informational [6] PostedTimestamp: 2015-01-07 02:57:38.130052 ExpirationTimestamp: 2083-01-25 06:11:45.130052 EventCodeDescription: Node State Change ProblemDescription: Changing node v_verticabip_node0001 startup state to READY DatabaseName: verticabip Hostname: c0049696.itcs.hp.com
    2015-01-07 02:57:38.130 DistCall Dispatch:0x7ff7a40069c0-c0000000003a32 [Recover] <INFO> Changing node v_verticabip_node0001 startup state from INITIALIZING to READY
    2015-01-07 02:57:38.130 DistCall Dispatch:0x7ff7a40069c0-c0000000003a32 [Recover] <INFO> waitForNodes: Cluster is not complete: available=1, needed=3. Waiting for recovery
    2015-01-07 02:57:38.131 Spread Client:0x7a7fb70 [Comms] <INFO> nodeToState map:
    2015-01-07 02:57:38.131 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0001 : READY
    2015-01-07 02:57:38.131 Spread Client:0x7a7fb70 [Comms] <INFO>   v_verticabip_node0002 : READY
    2015-01-07 02:57:38.131 Spread Client:0x7a7fb70 [Comms] <INFO> &
  • Options
    Unfortunately this vertica.log that you have provided contains only information after you have restarted the node at 2015-01-07 around 02:57:17 & doesn't contain the down event of the node (Node 01).So can you please check carefully if this is the complete vertica.log from the crashed node (Node 01).Else can you please check if there is any other vertica.log.gz file located under the same catalog directory that might have contain that shutdown information.If yes please provide the same to us in order for us to analyze this further.

    Regards
    Rahul


  • Options
    Yes, i noticed it's truncated after I posted the log. You should be able view the full log from link below. Thank you!

    https://onedrive.live.com/?cid=A7DE1256AE63944F&id=A7DE1256AE63944F%21110

  • Options
    From the vertica.log uploaded on the above link it looks to be an abrupt shutdown where vertica.log all of sudden stopped writing at 2015-01-07 13:15:30 & only next entry is for the time when node was restarted today at 2015-01-07 13:15 approx.

    So can you also provide "/var/log/messages" output of the same node just to see if anything unusual got reported there,

    Snippet of vertica.log
    ==================
    2015-01-07 13:15:30.021 DistCall Dispatch:0x7f09d801e250-c00000000040a7 [VMPI] <INFO> GetClusterLGE: My local node LGE = 0x12015-01-07 13:15:30.032 DistCall Dispatch:0x7f09d801e250-c00000000040a7 [Txn] <INFO> Rollback Txn: c00000000040a7 'Manage Epochs: Advance AHM'

    2015-01-08 03:25:37.211 INFO New log
    2015-01-08 03:25:37.211 unknown:0x7ff1397b4700 [Init] <INFO> Log /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog/vertica.log opened; #1
    2015-01-08 03:25:37.211 unknown:0x7ff1397b4700 [Init] <INFO> Processing command line: /opt/vertica/bin/vertica -D /data/vertica/catalog/verticabip/v_verticabip_node0001_catalog -C verticabip -n v_verticabip_node0001 -h 15.50.17.123 -p 5433 -P 4803 -Y ipv4
    2015-01-08 03:25:37.211 unknown:0x7ff1397b4700 [Init] <INFO> Starting up Vertica Analytic Database v7.1.1-0
    2015-01-08 03:25:37.211 unknown:0x7ff1397b4700 [Init] <INFO> Project Codename: Dragline
    2015-01-08 03:25:37.211 unknown:0x7ff1397b4700 [Init] <INFO> vertica(v7.1.1-0) built by release@build2.verticacorp.com from releases/VER_7_1_RELEASE_BUILD_1_0_20141016@148158 on 'Thu Oct 16 15:55:09 America/New_York 2014' $BuildId$


    Regards
    Rahul
  • Options
    Here is output from /var/log/messages. Thank you!

    $ cat "/var/log/messages"
    Jan  4 03:44:01 c0049696 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1566" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Jan  4 03:44:02 c0049696 rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
    Jan  5 03:35:03 c0049696 rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
    Jan  5 16:01:36 c0049696 abrtd: Directory 'ccpp-2015-01-05-16:01:29-15909' creation detected
    Jan  5 16:01:36 c0049696 abrt[1408]: Saved core dump of pid 15909 (/opt/vertica/bin/vertica) to /var/spool/abrt/ccpp-2015-01-05-16:01:29-15909 (1747222528 bytes)
    Jan  5 16:01:36 c0049696 abrtd: Package 'vertica' isn't signed with proper key
    Jan  5 16:01:36 c0049696 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-01-05-16:01:29-15909' exited with 1
    Jan  5 16:01:36 c0049696 abrtd: Corrupted or bad directory '/var/spool/abrt/ccpp-2015-01-05-16:01:29-15909', deleting
    Jan  6 03:32:03 c0049696 rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
    Jan  6 16:08:19 c0049696 abrt[25497]: Saved core dump of pid 30336 (/opt/vertica/bin/vertica) to /var/spool/abrt/ccpp-2015-01-06-16:08:08-30336 (1796984832 bytes)
    Jan  6 16:08:19 c0049696 abrtd: Directory 'ccpp-2015-01-06-16:08:08-30336' creation detected
    Jan  6 16:08:19 c0049696 abrtd: Package 'vertica' isn't signed with proper key
    Jan  6 16:08:19 c0049696 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-01-06-16:08:08-30336' exited with 1
    Jan  6 16:08:19 c0049696 abrtd: Corrupted or bad directory '/var/spool/abrt/ccpp-2015-01-06-16:08:08-30336', deleting
    Jan  7 03:39:03 c0049696 rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
    Jan  7 08:15:56 c0049696 abrtd: Directory 'ccpp-2015-01-07-08:15:47-14410' creation detected
    Jan  7 08:15:56 c0049696 abrt[10249]: Saved core dump of pid 14410 (/opt/vertica/bin/vertica) to /var/spool/abrt/ccpp-2015-01-07-08:15:47-14410 (1837305856 bytes)
    Jan  7 08:15:56 c0049696 abrtd: Package 'vertica' isn't signed with proper key
    Jan  7 08:15:56 c0049696 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-01-07-08:15:47-14410' exited with 1
    Jan  7 08:15:56 c0049696 abrtd: Corrupted or bad directory '/var/spool/abrt/ccpp-2015-01-07-08:15:47-14410', deleting
    Jan  7 13:17:06 c0049696 abrtd: Directory 'ccpp-2015-01-07-13:16:58-13385' creation detected
    Jan  7 13:17:06 c0049696 abrt[8116]: Saved core dump of pid 13385 (/opt/vertica/bin/vertica) to /var/spool/abrt/ccpp-2015-01-07-13:16:58-13385 (1830232064 bytes)
    Jan  7 13:17:06 c0049696 abrtd: Package 'vertica' isn't signed with proper key
    Jan  7 13:17:06 c0049696 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-01-07-13:16:58-13385' exited with 1
    Jan  7 13:17:06 c0049696 abrtd: Corrupted or bad directory '/var/spool/abrt/ccpp-2015-01-07-13:16:58-13385', deleting
    Jan  8 03:30:02 c0049696 rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file