Vertica takes a long time to start

Hi All,

I was worried about taking a VM snapshot while Vertica was running, so I decided to stop it first.

Do I have to stop Vertica to perform VM snapshots?

Vertica is taking a long time to become available; the last time I restarted it, it took about 45 minutes to come back up.

The vertica.log content follows:

2017-02-23 12:08:57.021 unknown:0x7f0320a97700 [Txn] Catalog sequence set, seq num=a6de39d1, nodeID=a
2017-02-23 12:08:57.021 unknown:0x7f0320a97700 [Txn] Found my node (v_drdata_node0001) in the catalog
2017-02-23 12:08:57.021 unknown:0x7f0320a97700 [Txn] Catalog info: version=0x2bea3ec, number of nodes=1, permanent #=1, K=0
2017-02-23 12:08:57.022 unknown:0x7f0320a97700 [Txn] Catalog info: current epoch=0x19eaf38
2017-02-23 12:08:57.207 unknown:0x7f0320a97700 [Catalog] Catalog OID generator updated based on LOCAL tier catalog
2017-02-23 12:08:57.341 unknown:0x7f0320a97700 [Init] Catalog loaded
2017-02-23 12:08:57.568 unknown:0x7f0320a97700 [Comms] About to launch spread with '/opt/vertica/spread/sbin/spread -c /app/catalog/drdata/v_drdata_node0001_catalog/spread.conf'
2017-02-23 12:08:57.576 unknown:0x7f0320a97700 [Init] Listening on port: 5433
2017-02-23 12:08:57.577 unknown:0x7f0320a97700 [Init] About to fork
2017-02-23 12:08:57.584 unknown:0x7f0320a97700 [Init] About to fork again
2017-02-23 12:08:57.591 unknown:0x7f0320a97700 [Init] Completed forking
2017-02-23 12:08:57.592 unknown:0x7f0320a97700 [Init] Startup [Connecting to Spread] Connecting to spread 4803
2017-02-23 12:08:58.726 unknown:0x7f0320a97700 [Comms] connected to spread as #node_a#N172023201160
2017-02-23 12:08:58.726 unknown:0x7f0320a97700 [Init] PID=1722
2017-02-23 12:08:58.726 unknown:0x7f0320a97700 [Init] Start reading DataCollector information
2017-02-23 12:08:58.730 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 0 / 1179128391
2017-02-23 12:12:21.180 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 124070325 / 1179128391
2017-02-23 12:14:15.501 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 245356234 / 1179128391
2017-02-23 12:16:16.708 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 382377173 / 1179128391
2017-02-23 12:18:33.052 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 502611093 / 1179128391
2017-02-23 12:21:17.771 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 621498606 / 1179128391
2017-02-23 12:23:14.466 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 739949464 / 1179128391
2017-02-23 12:25:04.137 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 857987793 / 1179128391
2017-02-23 12:27:35.929 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 986765394 / 1179128391
2017-02-23 12:28:38.222 unknown:0x7f0320a97700 [Init] Startup [Read DataCollector] Inventory files (bytes) - 1105188507 / 1179128391
2017-02-23 12:28:46.081 unknown:0x7f0320a97700 [Init] Done reading DataCollector information
2017-02-23 12:28:46.082 Main:0x6506fb0 @v_drdata_node0001: 57V03/4973: The UDx zygote process is down, restarting it...
2017-02-23 12:28:47.019 Main:0x6506fb0 [Init] Setting up Messenger service
2017-02-23 12:28:47.020 Main:0x6506fb0 [Dist] Cluster point-point listener port 5434
2017-02-23 12:28:47.021 Main:0x6506fb0 @v_drdata_node0001: 00000/3885: Locking [/app/catalog/drdata/v_drdata_node0001_catalog/vertica.pid], fdLocked == 13
2017-02-23 12:28:47.055 Main:0x6506fb0 [Init] Using "Brazil/East" as the default timezone
2017-02-23 12:28:47.056 Main:0x6506fb0 [Txn] Begin Txn: fff0000000000cc1 'Initialization'
2017-02-23 12:28:47.057 Main:0x6506fb0 [Catalog] Running SINGLE upgrade tasks v7.0.2-5(149346) -> v7.0.2-5(149346)
2017-02-23 12:28:47.057 Main:0x6506fb0 [Catalog] License Audit Record Hash Check Passed.
2017-02-23 12:28:47.057 Main:0x6506fb0 [Catalog] Full License Hash Check Passed.
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool general - Queries: 10000 Threads: 8192 File Handles: 39263 Memory(KB): 9961926
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool sysquery - Queries: 10000 Threads: 8245 File Handles: 39518 Memory(KB): 10027462
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool sysdata - Memory(KB): 1048576
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool wosdata - Memory(KB): 4194304
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool tm - Queries: 3 Threads: 8358 File Handles: 40061 Memory(KB): 10166726
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool refresh - Queries: 10000 Threads: 8192 File Handles: 39263 Memory(KB): 9961926
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool recovery - Queries: 3 Threads: 8192 File Handles: 39263 Memory(KB): 9961926
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool dbd - Queries: 10000 Threads: 8192 File Handles: 39263 Memory(KB): 9961926
2017-02-23 12:28:47.057 Main:0x6506fb0 [ResourceManager] pool jvm - Queries: 10000 T

Comments

  • [Deleted User] Administrator

    From the uploaded logs, Vertica spent twenty minutes (12:08 to 12:28) reading DataCollector data. Have you significantly increased the DataCollector retention policy? If so, that may be something to look into.
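
    For example, one way to inspect and shrink the retention settings from vsql. This is a sketch; the component name 'RequestsIssued' is only an illustration, and connection options are omitted:

    ```shell
    # List the DataCollector components present on this node.
    vsql -c "SELECT DISTINCT component FROM data_collector;"

    # Show the current retention policy for one component (illustrative name).
    vsql -c "SELECT get_data_collector_policy('RequestsIssued');"

    # Reduce retention for that component: memory size (KB), then disk size (KB).
    # Smaller disk retention means less DataCollector data to replay at startup.
    vsql -c "SELECT set_data_collector_policy('RequestsIssued', '1000', '10000');"
    ```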

  • I am also facing same issue, did the solution of looking into retention policy for data collector worked?

  • [Deleted User] Administrator

    You want to check startup.log in the catalog directory (the same place where one finds vertica.log) for more clues on why startup is taking so long. Two known causes are a large catalog with millions of ROS files, and a large DataCollector retention policy. I would start with startup.log.
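
    A quick sketch of that check, assuming the catalog path shown in the log excerpt above:

    ```shell
    # Node catalog directory, taken from the vertica.log lines in the question.
    CATDIR=/app/catalog/drdata/v_drdata_node0001_catalog

    # Size of the DataCollector data that must be read back at startup.
    du -sh "$CATDIR"/DataCollector

    # Rough catalog size (a very large catalog is the other common culprit).
    du -sh "$CATDIR"/Catalog

    # Startup progress and per-phase timings from the most recent start.
    tail -n 50 "$CATDIR"/startup.log
    ```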
