
Bug#657689: e1000e Detected Hardware Unit Hang:



I'm seeing a very similar issue on two different machines with two different Intel cards: one has the Intel 82572EI and the other the 82571EB.  After installing irqbalance on both machines, the messages have gone away _only_ on the 82571EB machine (before that, they appeared multiple times daily).

Here is the last message from the 82571EB machine:

May 23 15:59:17 node01 kernel: [36728.009353] e1000e 0000:04:00.0: eth2: Detected Hardware Unit Hang:
May 23 15:59:17 node01 kernel: [36728.009355]   TDH                  <39>
May 23 15:59:17 node01 kernel: [36728.009356]   TDT                  <3b>
May 23 15:59:17 node01 kernel: [36728.009357]   next_to_use          <3b>
May 23 15:59:17 node01 kernel: [36728.009357]   next_to_clean        <39>
May 23 15:59:17 node01 kernel: [36728.009358] buffer_info[next_to_clean]:
May 23 15:59:17 node01 kernel: [36728.009359]   time_stamp           <1008af3ae>
May 23 15:59:17 node01 kernel: [36728.009359]   next_to_watch        <39>
May 23 15:59:17 node01 kernel: [36728.009360]   jiffies              <1008af4f9>
May 23 15:59:17 node01 kernel: [36728.009361]   next_to_watch.status <0>
May 23 15:59:17 node01 kernel: [36728.009362] MAC Status             <80383>
May 23 15:59:17 node01 kernel: [36728.009362] PHY Status             <792d>
May 23 15:59:17 node01 kernel: [36728.009363] PHY 1000BASE-T Status  <7800>
May 23 15:59:17 node01 kernel: [36728.009364] PHY Extended Status    <3000>
May 23 15:59:17 node01 kernel: [36728.009365] PCI Status             <10>


And here is one from the 82572EI machine, which is still logging these messages:

May 26 16:09:27 host02 kernel: [2389975.056960] e1000e 0000:07:00.0: eth2: Detected Hardware Unit Hang:
May 26 16:09:27 host02 kernel: [2389975.056961]   TDH                  <38>
May 26 16:09:27 host02 kernel: [2389975.056962]   TDT                  <3b>
May 26 16:09:27 host02 kernel: [2389975.056963]   next_to_use          <3b>
May 26 16:09:27 host02 kernel: [2389975.056963]   next_to_clean        <38>
May 26 16:09:27 host02 kernel: [2389975.056964] buffer_info[next_to_clean]:
May 26 16:09:27 host02 kernel: [2389975.056965]   time_stamp           <1239be208>
May 26 16:09:27 host02 kernel: [2389975.056966]   next_to_watch        <38>
May 26 16:09:27 host02 kernel: [2389975.056966]   jiffies              <1239be30c>
May 26 16:09:27 host02 kernel: [2389975.056967]   next_to_watch.status <0>
May 26 16:09:27 host02 kernel: [2389975.056968] MAC Status             <80383>
May 26 16:09:27 host02 kernel: [2389975.056968] PHY Status             <792d>
May 26 16:09:27 host02 kernel: [2389975.056969] PHY 1000BASE-T Status  <7800>
May 26 16:09:27 host02 kernel: [2389975.056970] PHY Extended Status    <3000>
May 26 16:09:27 host02 kernel: [2389975.056970] PCI Status             <10>
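
For anyone comparing dumps like the two above, here is a rough sketch of a script that pulls the hex fields out of the syslog lines and reports how many descriptors sit between TDH and TDT and how many jiffies passed since the stuck buffer was time-stamped. The function names and the ring size of 256 are my own assumptions for illustration (check the real ring size with `ethtool -g`); treating bit 0 of next_to_watch.status as the "descriptor done" flag is also an assumption about the descriptor layout. It does not convert jiffies to seconds because that depends on the kernel's HZ setting.

```python
import re

# Matches dump lines such as "... [36728.009355]   TDH      <39>".
# Group 1 is the field name, group 2 the hex value without "0x".
FIELD_RE = re.compile(r"\]\s+([^<]+?)\s+<([0-9a-fA-F]+)>\s*$")


def parse_hang(lines):
    """Return {field name: integer value} for one hang dump."""
    fields = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields, ring_size=256):
    """Derive a few numbers from a parsed dump.

    ring_size is an ASSUMED TX ring length, used only to handle
    wrap-around when TDT is numerically smaller than TDH.
    """
    return {
        # Descriptors handed to the NIC but not yet fetched by it.
        "pending_descriptors": (fields["TDT"] - fields["TDH"]) % ring_size,
        # Age of the stuck buffer when the hang was reported.
        "stalled_jiffies": fields["jiffies"] - fields["time_stamp"],
        # ASSUMPTION: bit 0 of the status is the "descriptor done" flag.
        "descriptor_done": bool(fields["next_to_watch.status"] & 1),
    }


# A few lines copied from the 82571EB dump above.
sample = """\
May 23 15:59:17 node01 kernel: [36728.009355]   TDH                  <39>
May 23 15:59:17 node01 kernel: [36728.009356]   TDT                  <3b>
May 23 15:59:17 node01 kernel: [36728.009358] buffer_info[next_to_clean]:
May 23 15:59:17 node01 kernel: [36728.009359]   time_stamp           <1008af3ae>
May 23 15:59:17 node01 kernel: [36728.009360]   jiffies              <1008af4f9>
May 23 15:59:17 node01 kernel: [36728.009361]   next_to_watch.status <0>
""".splitlines()

print(summarize(parse_hang(sample)))
```

Run against the dumps above, this gives 2 pending descriptors and 331 stalled jiffies for the 82571EB machine, and 3 pending descriptors and 260 stalled jiffies for the 82572EI machine, with the done flag clear in both cases.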


From searching around, it looks like this may already be fixed elsewhere.  See the following reports:

https://bugzilla.redhat.com/show_bug.cgi?id=785806

http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg04658.html


Let me know if you need more specific information related to my systems.

--
Matth

