Strange results in io_usage table

I have been looking through the various informational tables to try to identify some performance problems, and the io_usage table is showing very strange results: written_kbytes_per_sec always far outweighs read_kbytes_per_sec. That might make sense during the small part of the day when I am doing a lot of data loading, but further investigation shows that it is that way the whole time the database is in use (primarily just for select queries):

hour  total_read  total_written
   7        0.13         662.39
   8      125.73      70,081.81
   9      150.37      80,543.02
  10       69.04      80,512.73
  11       71.29      80,758.04
  12       93.7       80,891.69
  13      247.98      80,669.46
  14    4,551.3       81,476.16
  15      216.37       9,596

That doesn't make sense to me; it seems like there would be a whole lot more reading going on than writing. Am I interpreting this data wrong? Why would this be?

Comments

  • Hi Kevin, I believe that table is just physical reads. Is it possible that most of your queries hit data that is buffered in memory? That table also lists all IO on the system, not just from Vertica. Is it possible that you have another process on the system doing a lot of writing? Adam
  • It's possible that the queries are just hitting data that is buffered in memory, but even if that were the case, it seems strange that the disk is being written to so often. It almost seems like the numbers are reversed. The cluster this data is from is dedicated solely to running Vertica, so the writes are unlikely to be coming from other processes. I thought maybe a lot of queries are spilling to disk when creating intermediate results, but it seems like they would have to read from disk to get that data in the first place.
  • Actually, it's not necessarily strange at all. Linux will always write new file data to disk. So if you're loading data, you will always see disk writes. But the file data that's being written to disk must start off in memory. So why remove it? If you have enough RAM, why not just keep a copy in memory, so there's no need to read it at all? So it's very possible, if you have enough RAM, to write data to disk once and never read it again ever, not even once. The disk is then just a system of record, so that you can restore / read back into memory if the server crashes or something. If you don't believe Vertica's output, you can ask the Linux kernel -- take a look in "/proc/diskstats". If that file consistently disagrees sharply with the io_usage table for you, that would be quite interesting to know.
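
For anyone who wants to try the "/proc/diskstats" cross-check suggested above, here is a minimal Python sketch, not anything Vertica ships. It assumes the standard Linux layout of that file, where the third field is the device name, the sixth is sectors read, and the tenth is sectors written, with sectors counted in 512-byte units. The function names (parse_diskstats, kb_per_sec, sample) are made up for illustration.

```python
import time

# /proc/diskstats reports sectors in 512-byte units, independent of the
# device's physical block size.
SECTOR_BYTES = 512


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written)} from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        # fields[2] = device name, fields[5] = sectors read,
        # fields[9] = sectors written (cumulative since boot)
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def kb_per_sec(before, after, seconds):
    """Per-device (read_kb_per_sec, written_kb_per_sec) between two snapshots."""
    rates = {}
    for dev, (r1, w1) in after.items():
        if dev not in before:
            continue  # device appeared between the two snapshots
        r0, w0 = before[dev]
        rates[dev] = (
            (r1 - r0) * SECTOR_BYTES / 1024 / seconds,
            (w1 - w0) * SECTOR_BYTES / 1024 / seconds,
        )
    return rates


def sample(interval=10.0, path="/proc/diskstats"):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return kb_per_sec(before, after, interval)
```

Calling sample(10) on a node while queries are running returns a dict keyed by device name. Whole disks and their partitions both appear in the file, so compare a single device against io_usage rather than summing every row. If the kernel's numbers show the same write-heavy pattern, the table is probably reporting correctly and the reads are being served from memory.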
