Hunting for cause of high disk read rate
Sporadically our 6.1.1 cluster of 12 nodes hammers disk, reading about 2G per second. Episodes last for about two hours. During these episodes, query rates are cut in half.
This chart shows recent occurrences.
Casual reading of vertica.log reveals no unusual queries, no recovery events or exceptional moveouts/mergeouts.
I am looking for troubleshooting techniques.
27/12 - 19:00
30/12 - 19:00
07/01 - 19:00
09/01 - 19:00
If it's not some scheduled process, how do you explain the 19:00 start time? Can you apply an FFT to the data and tell us the period or frequency? I'm pretty sure it's some kind of oscillation.
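A minimal sketch of the FFT idea, assuming the disk read rate is available as evenly spaced samples (for example one value per hour). The function name `dominant_period` and the synthetic series are illustrative only; they are not taken from the cluster's data. It uses a plain discrete Fourier transform from the standard library, which is slow but adequate for a few hundred samples.

```python
import cmath
import math


def dominant_period(samples, sample_interval_hours=1.0):
    """Return the period (in hours) of the strongest non-constant
    frequency component, or None if the series is flat."""
    n = len(samples)
    mean = sum(samples) / n
    # Remove the mean so the constant (zero-frequency) term does not dominate.
    centered = [s - mean for s in samples]

    best_k, best_power = None, 0.0
    # Only bins 1..n/2 are meaningful for real-valued input.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power

    if best_k is None:
        return None
    # Bin k corresponds to k cycles over the whole window of n samples.
    return n * sample_interval_hours / best_k


# Hypothetical data: ten days of hourly read rates with a daily cycle.
synthetic = [1000 + 800 * math.sin(2 * math.pi * t / 24) for t in range(240)]
print(dominant_period(synthetic))
```

With real data, a strong peak at 24 hours would point at something scheduled daily, while the irregular gaps between the dates above would show up as extra energy at other periods.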
We don't use MC, so I stopped vertica_agent. There have been no high disk IO incidents for three days since stopping vertica_agent.
You can start and stop vertica_agent again to verify whether it is really causing these high disk IO incidents.
/etc/init.d/vertica_agent start
/etc/init.d/vertica_agent stop
What activity was reported in agent.log file?
I'm stumped as to why.
When does the automatic audit of database size run? What time does your cron job run, 18:45? Does the blue color in the chart show IO reads only? An audit does massive IO reads on large databases.
Some tips (in case it re-emerges):
1. Increase the logging verbosity.
2. blktrace - generates traces of the I/O traffic on block devices
http://www.mimuw.edu.pl/~lichota/09-10/Optymalizacja-open-source/Materialy/10%20-%20Dysk/gelato_ICE0...
http://prefetch.net/blog/index.php/2009/02/16/tracing-block-io-operations-on-linux-hosts-with-blktra...
3. iotop - simple top-like I/O monitor
iotop should show you which process/thread is doing the massive IO reads.
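If iotop is not installed on the nodes, a rough equivalent can be sketched from the per-process counters that Linux exposes in /proc/<pid>/io. This is only an approximation of what iotop reports, assuming a Linux host and enough privileges (usually root) to read other users' counters. The helper names `parse_proc_io`, `snapshot_read_bytes`, `top_readers` and `report` are made up for this example.

```python
import os
import time


def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def snapshot_read_bytes():
    """Return {pid: read_bytes} for every process whose counters are readable."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/io") as fh:
                result[int(entry)] = parse_proc_io(fh.read()).get("read_bytes", 0)
        except OSError:
            # Process exited, or we lack permission to read its counters.
            continue
    return result


def top_readers(before, after, limit=5):
    """Rank processes by bytes read from storage between two snapshots."""
    deltas = {pid: after[pid] - before[pid] for pid in after if pid in before}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:limit]


def report(interval_seconds=5):
    """Print the heaviest readers over one sampling interval."""
    first = snapshot_read_bytes()
    time.sleep(interval_seconds)
    second = snapshot_read_bytes()
    for pid, delta in top_readers(first, second):
        print(f"pid {pid}: {delta / interval_seconds / 1e6:.1f} MB/s read")
```

Calling `report()` during an episode lists the processes that read the most over the interval. Per-thread counters live under /proc/<pid>/task/<tid>/io and could be sampled the same way.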
Regards.
The table projections are ordered and segmented by d_id and i_id. The table has a little over 7T rows.
What about deletes on "some_table"? Delete vectors can slow a query. According to the docs, having more than 10% of the data deleted is not recommended; in my experience the threshold is lower.