
Re: mii-diag returns strange errors



Wojciech Ziniewicz wrote:

But I'll change the ethernet cable .. there were some servers mounted
during last week that could generate some noise or anyway they could
just pull the cable in an accident or something...
More or less - the server should not report so many errors on the link
and i'll trace this problem down...

Understood,

To share one other thing that's helped me in the past: sometimes it's a good idea to reset the counters on both sides to zero and watch them from "now". Your counters have probably been running since your last reboot, so you don't really know when the errors occurred.

A "baseline" look at the numbers sometimes shows there's no problem at all right this minute, and by watching carefully you can see whether there's a pattern by time of day, etc.
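If the counters can't easily be zeroed, the same "baseline" effect can be had by taking a snapshot now and only looking at the difference from it. Here's a rough sketch in Python, assuming a Linux box that exposes per-interface counters under /sys/class/net/<iface>/statistics/. The function names, the choice of counters, and the "eth0" interface name used below are my own assumptions for illustration, not something from your setup.

```python
import time
from pathlib import Path

# Counter files exposed by Linux under /sys/class/net/<iface>/statistics/
COUNTERS = ("rx_errors", "tx_errors", "rx_crc_errors", "collisions")


def read_counters(iface, root="/sys/class/net"):
    """Return the current value of each counter in COUNTERS for one interface."""
    stats_dir = Path(root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text()) for name in COUNTERS}


def delta(baseline, current):
    """Errors accumulated since the baseline snapshot was taken."""
    return {name: current[name] - baseline[name] for name in baseline}


def watch(iface, interval=60, samples=5):
    """Print timestamped error deltas relative to a snapshot taken now."""
    baseline = read_counters(iface)
    for _ in range(samples):
        time.sleep(interval)
        print(time.strftime("%H:%M:%S"), delta(baseline, read_counters(iface)))
```

Calling watch("eth0") would print one timestamped line per minute. If the deltas stay at zero, the errors are old; if they climb, the timestamps show when.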

There are various ways to automate this too, of course... an idea for a future script, or for something bigger and more complex like Nagios or similar network/server monitoring software.
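As a starting point for that kind of automation, here's a minimal sketch of the decision logic a Nagios-style check might use. The exit codes 0, 1 and 2 are the usual Nagios plugin convention for OK, WARNING and CRITICAL. The function name and the warn/crit thresholds are hypothetical and would need tuning for a real link.

```python
# Exit codes follow the usual Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2


def check_errors(previous, current, warn=1, crit=100):
    """Classify the growth of an error counter between two polls.

    warn and crit are hypothetical thresholds; tune them for your link.
    Returns an (exit_code, message) pair.
    """
    new_errors = current - previous
    if new_errors >= crit:
        return CRITICAL, f"CRITICAL - {new_errors} new errors"
    if new_errors >= warn:
        return WARNING, f"WARNING - {new_errors} new errors"
    return OK, f"OK - {new_errors} new errors"
```

A wrapper script would store the previous counter value between runs, print the message, and exit with the returned code.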

Since you have "active" reports of problems right now, I'd say that the other servers or an accidental disconnect aren't your problem: a one-time event wouldn't keep producing errors, and this is ongoing. But you'd know best how long the reports take to make it to you from the end-users...

Really heavy I/O could also be involved here. An example: One thing that will stress any Ethernet pipe to the limit is over-the-network backups... thus the usual desire to deploy a SAN (storage area network) to keep that traffic away from "live" traffic.

High levels of I/O or even CPU load on the server could also be causing problems like this one... it all depends on how bad the load is and what cascading effects that load is having on subsystems, all the way down to how the kernel is talking to your Ethernet cards.

These comments are all basically a reminder that there's a "big picture" to watch, and it sounds like you're already watching closely for other things that could be helping or hindering. Some people forget.

If you haven't been doing this for long, it becomes more intuitive after a while. Your brain and your instincts remind you that someone made a change on X date, and that change is probably the reason for the "other" problems you're seeing in unexpected places.

Nate


