On understanding Linux I/O problems.
Recently I have been running into a variety of IO problems on different
machines: faulty hardware, bandwidth limits, and IO load from other
causes. These took me quite some time to diagnose, and I'm still not
100% sure I've fixed them.
So, I'm asking, what do most sysadmins use to diagnose IO problems?
I've found iostat, pidstat and iotop, which are all well and good, but
they can still leave me mystified as to the cause at times. I find
iotop rather useless, as it seems to cache the current status and delay
its output, which at times is displayed well after a large IO hit.
I've also poked at the IO schedulers to better understand how IO is
handled, yet I haven't found anything conclusive.
Should I look at some other status/benchmark or diagnostics tools to
better understand where some of my bottlenecks lie?
I don't want this to be an open-ended, time-sink type of post; the real
answer could simply be a document that addresses some of my questions. I
cannot find much info otherwise on IO and how to best