Bug#657689: e1000e Detected Hardware Unit Hang:
On 28 January 2012 at 23:04, ubik pl <ubik.pl@gmail.com> wrote:
> On Saturday, 28 January 2012, Ben Hutchings
> <ben@decadent.org.uk> wrote:
>
>> On Sat, 2012-01-28 at 09:56 +0100, ubik pl wrote:
>>> 2012/1/28 Ben Hutchings <ben@decadent.org.uk>:
>>> > On Sat, 2012-01-28 at 01:53 +0100, ubik pl wrote:
>>> >> Package: linux-2.6
>>> >> Version: 2.6.32-38
>>> >> Linux version 2.6.32-5-xen-amd64
>>> >> RAM 8GB
>>> >>
>>> >> Bug seems the same as #518182 and occurs on high traffic load.
>>> >> System automatically recovered after approx. 50
>>> >> minutes ([1782786.812335] e1000e 0000:00:19.0: eth0: Reset adapter)
>>> > [...]
>>> >
>>> > Why do you say 50 minutes? These messages cover a span of only 7 seconds.
>>> > Were they repeated?
>>> >
>>> > Ben.
>>> >
>>> No, they weren't, but after 50 minutes the network was accessible again (ssh,
>>> ping, etc.).
>>
>> Are you saying it was not accessible for that time, or you don't know
>> whether it was? Did you see anything logged during that period?
>>
>> Ben.
>
> It was not accessible. This is what Nagios is for.
> I don't have a serial console or similar, so I don't know what is logged when my
> machine is down.
The same situation occurred again a month later. The backup starts at 23:40;
15 minutes later the system becomes inaccessible for ~41 minutes and then
recovers automatically. kern.log contains:
Feb 27 00:32:02 v0 kernel: [4374221.812265] e1000e 0000:00:19.0: eth0: Detected Hardware Unit Hang:
Feb 27 00:32:02 v0 kernel: [4374221.812267] TDH <8f>
Feb 27 00:32:02 v0 kernel: [4374221.812269] TDT <91>
Feb 27 00:32:02 v0 kernel: [4374221.812270] next_to_use <91>
Feb 27 00:32:02 v0 kernel: [4374221.812271] next_to_clean <8f>
Feb 27 00:32:02 v0 kernel: [4374221.812272] buffer_info[next_to_clean]:
Feb 27 00:32:02 v0 kernel: [4374221.812273] time_stamp <1412d2e8b>
Feb 27 00:32:02 v0 kernel: [4374221.812274] next_to_watch <8f>
Feb 27 00:32:02 v0 kernel: [4374221.812275] jiffies <1412d3006>
Feb 27 00:32:02 v0 kernel: [4374221.812276] next_to_watch.status <0>
Feb 27 00:32:02 v0 kernel: [4374221.812277] MAC Status <80280>
Feb 27 00:32:02 v0 kernel: [4374221.812279] PHY Status <ffff>
Feb 27 00:32:02 v0 kernel: [4374221.812280] PHY 1000BASE-T Status <ffff>
Feb 27 00:32:02 v0 kernel: [4374221.812281] PHY Extended Status <ffff>
Feb 27 00:32:02 v0 kernel: [4374221.812282] PCI Status <10>
Feb 27 00:32:04 v0 kernel: [4374223.812265] e1000e 0000:00:19.0: eth0: Detected Hardware Unit Hang:
Feb 27 00:32:04 v0 kernel: [4374223.812267] TDH <8f>
Feb 27 00:32:04 v0 kernel: [4374223.812268] TDT <91>
Feb 27 00:32:04 v0 kernel: [4374223.812269] next_to_use <91>
Feb 27 00:32:04 v0 kernel: [4374223.812271] next_to_clean <8f>
Feb 27 00:32:04 v0 kernel: [4374223.812272] buffer_info[next_to_clean]:
Feb 27 00:32:04 v0 kernel: [4374223.812273] time_stamp <1412d2e8b>
Feb 27 00:32:04 v0 kernel: [4374223.812274] next_to_watch <8f>
Feb 27 00:32:04 v0 kernel: [4374223.812275] jiffies <1412d31fa>
Feb 27 00:32:04 v0 kernel: [4374223.812276] next_to_watch.status <0>
Feb 27 00:32:04 v0 kernel: [4374223.812277] MAC Status <80280>
Feb 27 00:32:04 v0 kernel: [4374223.812278] PHY Status <ffff>
Feb 27 00:32:04 v0 kernel: [4374223.812279] PHY 1000BASE-T Status <ffff>
Feb 27 00:32:04 v0 kernel: [4374223.812280] PHY Extended Status <ffff>
Feb 27 00:32:04 v0 kernel: [4374223.812282] PCI Status <10>
Feb 27 00:32:06 v0 kernel: [4374225.812243] e1000e 0000:00:19.0: eth0: Detected Hardware Unit Hang:
Feb 27 00:32:06 v0 kernel: [4374225.812246] TDH <8f>
Feb 27 00:32:06 v0 kernel: [4374225.812247] TDT <91>
Feb 27 00:32:06 v0 kernel: [4374225.812248] next_to_use <91>
Feb 27 00:32:06 v0 kernel: [4374225.812249] next_to_clean <8f>
Feb 27 00:32:06 v0 kernel: [4374225.812250] buffer_info[next_to_clean]:
Feb 27 00:32:06 v0 kernel: [4374225.812251] time_stamp <1412d2e8b>
Feb 27 00:32:06 v0 kernel: [4374225.812252] next_to_watch <8f>
Feb 27 00:32:06 v0 kernel: [4374225.812253] jiffies <1412d33ee>
Feb 27 00:32:06 v0 kernel: [4374225.812255] next_to_watch.status <0>
Feb 27 00:32:06 v0 kernel: [4374225.812256] MAC Status <80280>
Feb 27 00:32:06 v0 kernel: [4374225.812257] PHY Status <ffff>
Feb 27 00:32:06 v0 kernel: [4374225.812258] PHY 1000BASE-T Status <ffff>
Feb 27 00:32:06 v0 kernel: [4374225.812259] PHY Extended Status <ffff>
Feb 27 00:32:06 v0 kernel: [4374225.812260] PCI Status <10>
Feb 27 00:32:08 v0 kernel: [4374227.812060] e1000e 0000:00:19.0: eth0: Reset adapter
Feb 27 00:32:08 v0 kernel: [4374228.128319] br0: port 1(eth0) entering disabled state
Feb 27 00:32:10 v0 kernel: [4374230.272868] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Feb 27 00:32:10 v0 kernel: [4374230.273195] br0: port 1(eth0) entering learning state
Feb 27 00:32:25 v0 kernel: [4374245.272012] br0: port 1(eth0) entering forwarding state
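
For anyone reading the three hang reports above, the hex values can be turned into elapsed time with a little arithmetic. The following is only a rough sketch, not part of the original report: it assumes that time_stamp holds the jiffies value at which the stuck TX descriptor was queued, and it infers the tick rate (HZ) from the log itself by comparing the jiffies delta with the kernel timestamp delta, since HZ is not stated anywhere in this thread. The function names are made up for illustration, and TDT - TDH is computed without handling ring wrap-around.

```python
import re

# Excerpt of the kern.log lines quoted above (values copied verbatim,
# syslog prefix dropped, wrapped lines rejoined).
LOG = """\
[4374221.812265] e1000e 0000:00:19.0: eth0: Detected Hardware Unit Hang:
[4374221.812267] TDH <8f>
[4374221.812269] TDT <91>
[4374221.812273] time_stamp <1412d2e8b>
[4374221.812275] jiffies <1412d3006>
[4374223.812265] e1000e 0000:00:19.0: eth0: Detected Hardware Unit Hang:
[4374223.812267] TDH <8f>
[4374223.812268] TDT <91>
[4374223.812273] time_stamp <1412d2e8b>
[4374223.812275] jiffies <1412d31fa>
[4374225.812243] e1000e 0000:00:19.0: eth0: Detected Hardware Unit Hang:
[4374225.812246] TDH <8f>
[4374225.812247] TDT <91>
[4374225.812251] time_stamp <1412d2e8b>
[4374225.812253] jiffies <1412d33ee>
"""

HANG = re.compile(r"\[\s*(\d+\.\d+)\].*Detected Hardware Unit Hang")
FIELD = re.compile(r"\[\s*\d+\.\d+\]\s+(\w+)\s+<([0-9a-f]+)>")


def parse_hangs(text):
    """Group 'name <hex>' fields under the hang report they follow."""
    reports = []
    for line in text.splitlines():
        m = HANG.search(line)
        if m:
            reports.append({"t": float(m.group(1))})
            continue
        m = FIELD.search(line)
        if m and reports:
            reports[-1][m.group(1)] = int(m.group(2), 16)
    return reports


def estimate_hz(reports):
    """Infer ticks per second from first and last report (an assumption)."""
    d_jiffies = reports[-1]["jiffies"] - reports[0]["jiffies"]
    d_seconds = reports[-1]["t"] - reports[0]["t"]
    return round(d_jiffies / d_seconds)


reports = parse_hangs(LOG)
hz = estimate_hz(reports)
print(f"estimated HZ: {hz}")
for r in reports:
    stalled = r["jiffies"] - r["time_stamp"]  # ticks since descriptor was queued
    pending = r["TDT"] - r["TDH"]             # descriptors not yet fetched (no wrap handling)
    print(f"t={r['t']:.6f}  stalled {stalled} jiffies (~{stalled / hz:.2f} s), "
          f"{pending} descriptors pending")
```

Under those assumptions the same descriptor (time_stamp never changes, TDH stays at 8f) has been waiting roughly 1.5, 3.5 and 5.5 seconds at the three reports, which matches the watchdog firing every 2 seconds until the adapter reset at 00:32:08. This covers only the logged seconds before the reset and says nothing about the ~41 minutes of unreachability.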