Re: mii-diag returns strange errors
Wojciech Ziniewicz wrote:
> But I'll change the ethernet cable .. there were some servers mounted
> during last week that could generate some noise or anyway they could
> just pull the cable in an accident or something...
> More or less - the server should not report so many errors on the link
> and i'll trace this problem down...
To share one other thing that's helped me in the past... sometimes it's
a good idea to reset the counters on both sides to zero and watch it
"now" -- your counters have probably been running since your last
reboot, and you don't really know when the errors occurred.
A "baseline" look at the numbers sometimes shows there's no problem at
all, right this minute, and by watching carefully you can see if there's
a pattern in time-of-day, etc.
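If the counters can't easily be zeroed, the same "baseline" effect can be had by recording them once and subtracting later. Here's a rough sketch of that idea in Python, assuming a Linux box where the counters are readable from /proc/net/dev. The function names (parse_net_dev, delta, snapshot) are made up for illustration and aren't part of mii-diag or any other tool.

```python
# Column order of the 16 per-interface counters in Linux /proc/net/dev.
RX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed"]


def parse_net_dev(text):
    """Turn the text of /proc/net/dev into {iface: {counter: value}}."""
    counters = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # the two header lines contain no colon
        values = rest.split()
        if len(values) != len(RX_FIELDS) + len(TX_FIELDS):
            continue  # unexpected layout, skip rather than guess
        numbers = [int(v) for v in values]
        stats = {}
        for field, value in zip(RX_FIELDS, numbers[:len(RX_FIELDS)]):
            stats["rx_" + field] = value
        for field, value in zip(TX_FIELDS, numbers[len(RX_FIELDS):]):
            stats["tx_" + field] = value
        counters[name.strip()] = stats
    return counters


def delta(baseline, current):
    """Return how much each counter grew since the baseline snapshot."""
    changes = {}
    for iface, now in current.items():
        before = baseline.get(iface)
        if before is None:
            continue  # interface appeared after the baseline was taken
        changes[iface] = {key: now[key] - before[key] for key in now}
    return changes


def snapshot(path="/proc/net/dev"):
    """Read the current counters (the path is an assumption: Linux only)."""
    with open(path) as handle:
        return parse_net_dev(handle.read())
```

Call snapshot() once to record the baseline, call it again some time later, and pass both results to delta(). A non-zero rx_errs or tx_errs in the result means the errors happened inside that window, not at some unknown point since the last reboot.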
There are various ways to automate this too, of course... a small script
of your own, or something bigger and more complex such as Nagios or
similar network/server monitoring software.
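As a rough illustration of the monitoring route: Nagios-style check plugins report their result through the exit code (0 for OK, 1 for WARNING, 2 for CRITICAL). The sketch below only shows the decision logic for an error counter sampled twice. The function name evaluate and the warn/crit thresholds are hypothetical values picked for the example, not recommendations.

```python
# Exit codes used by Nagios-style check plugins.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(previous, current, warn=1, crit=100):
    """Compare two samples of an error counter and pick a status.

    previous/current are the counter values at the last and the present
    check. The warn and crit thresholds are illustrative assumptions.
    """
    if current >= previous:
        new_errors = current - previous
    else:
        # Counter went backwards (e.g. a reboot or driver reload reset it),
        # so everything counted so far is new.
        new_errors = current

    if new_errors >= crit:
        return CRITICAL, f"CRITICAL - {new_errors} new errors"
    if new_errors >= warn:
        return WARNING, f"WARNING - {new_errors} new errors"
    return OK, f"OK - {new_errors} new errors"
```

A real plugin would wrap this: read the counter, load the previous value from wherever it was saved, print the message, and exit with the returned code. Because each check is stamped with the time it ran, the time-of-day pattern shows up in the monitoring history for free.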
Since you have "active" reports of problems right now, I'd say that the
newly mounted servers or an accidental cable pull aren't your problem:
a one-time event like that wouldn't explain errors that are still
ongoing. But you'd know best how long the reports take to make it to
you from the end-users...
Really heavy I/O could also be involved here. An example: One thing
that will stress any Ethernet pipe to the limit is over-the-network
backups... thus the usual desire to deploy a SAN (storage area network)
to keep that traffic away from "live" traffic.
High levels of I/O or even CPU on the server could also be causing
problems like this one... it all depends on how bad the load is and
what cascading effects that load is having on subsystems, all the way
down to how the kernel is talking to your Ethernet cards.
These comments are all basically a reminder that there's a "big picture"
to watch, and it sounds like you're already very alertly watching for
other things that could be helping or hindering. Some people forget.
If you haven't been doing this for a long time, it becomes more
intuitive after a while. Your brain and your instincts remind you that
someone made a change on X date, and that change is probably the reason
for the "other" problems you're seeing in unexpected places.