
Re: Intel X540-AT2 and Debian: intermittent connection



On Sat, 2022-11-19 at 17:35 -0800, David Christensen wrote:
> On 11/19/22 15:51, hw wrote:
> > On Sat, 2022-11-19 at 13:35 -0800, David Christensen wrote:
> > > On 11/19/22 06:50, hw wrote:
> > > > On Fri, 2022-11-18 at 17:02 -0800, David Christensen wrote:
> > > 
> > > > > ... I suggest trying a Category 6A factory patch cable at least 2
> > > > > meters long.
> > > > 
> > > > I tried it with a 10m cat6 cable and the connection was intermittent.
> > > > It's the same (as in "identical to") cable that works between the other
> > > > server and a client.
> > > 
> > > 
> > > Okay.  I suggest putting a unique mark/ serial number on each cable for
> > > tracking purposes until you resolve the intermittent connection issue.
> > 
> > What for?  All the cables I used except for the new ones are known good.
> 
> 
> Sanity check/ OCD.  I went through a period with SATA III drive problems 
> and marked all of my SATA cables to help with troubleshooting.

Hm, I can see that for when you have so many of them that they're hard to tell
apart.

> [...]
> > > Perhaps that is a good reason to do some devops development -- e.g.
> > > write a data-driven script that reads a configuration file to
> > > interconnect the VM virtual network interfaces and host physical network
> > > interfaces.
> > 
> > Why would I do that?  How would a script figure out which interface is which
> > and how would it guarantee that they will be exactly the same as seen by
> > OPNsense running in that VM when I switch them around?  I'm not saying it's
> > impossible, but I'd rather resolve this problem in a timely manner and not in
> > a couple years when I might have finished the script and tested it in a bunch
> > of servers which aren't even relevant.
> 
> 
> I agree that creating software for devops can be difficult and time 
> consuming, but it is nice to have when done.  I have built up a 
> collection of shell and Perl scripts over the years that are very useful.

I do that when it makes sense, not when it doesn't.  You should try to set up a
VM with OPNsense and a couple network cards you have to pass through so you see
how much fun that is.  It took me a day or two to get it to work stably, and it
was a one-time endeavour.  You'd have to have some robot arm to pull the server
from the rack, take off the cover (requires two arms maybe) and have them move
the cards in the PCI slots, controlled by an AI that's actually smart enough to
understand what it's doing and able to do the testing as well.  Good luck with
programming that :)

> > > > I suspect it's a mainboard issue.
> 
> 
> The clues support that hypothesis.

Or the card is broken.  At least Intel makes cards that appear to behave
somewhat consistently even when they don't work ;)
> > 

> [...]
> > > Did you try the d-i rescue shell
> > 
> > You mean the rescue system that comes with the Debian installer?  No, I
> > haven't.
> > How would that make a difference?
> 
> 
> It would provide data point for troubleshooting.

The LEDs on the cards don't come on when the rescue system is running and it
doesn't work at all.

> > > or any live sticks?
> > 
> > only the Fedora one
> 
> 
> That indicates a bad NIC and/or a bad PCIe slot.

Right, FreeBSD also makes an intermittent connection.
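
For the Linux side, a minimal sketch for putting a number on "intermittent".
It assumes a Linux host where the kernel exposes the link state in
/sys/class/net/<iface>/carrier; the interface name "eth0" and the helper names
are placeholders, not anything from this setup.  It only polls the carrier flag
and counts how often it changes, so it says nothing about why the link drops.

```python
import sys
import time
from pathlib import Path


def read_carrier(path):
    """Return True if the carrier file reads "1", False otherwise."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # Reading fails when the interface is administratively down
        # or does not exist; treat both as "no link".
        return False


def count_flaps(samples):
    """Count state changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


def watch(iface, seconds=60, interval=0.5):
    """Poll the carrier flag for `seconds` and return the number of changes."""
    path = f"/sys/class/net/{iface}/carrier"
    samples = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        samples.append(read_carrier(path))
        time.sleep(interval)
    return count_flaps(samples)


if __name__ == "__main__":
    # "eth0" is a placeholder; pass the real interface name as an argument.
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth0"
    print(f"{iface}: {watch(iface)} link-state changes in 60 s")
```

Running it once per slot/cable combination would give comparable counts
instead of impressions.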

> 
> > > > I plugged the Intel card back in and the on-card
> > > > worked again.  I'd try disabling the on-board card but there is no
> > > > option to do that in the BIOS.
> > > 
> > > Okay.  That indicates the issue is software.
> > 
> > How would it be a software issue affecting a network card from a totally
> > different manufacturer in a PCI slot that the BIOS doesn't have an option to
> > disable the on-board network card?
> 
> 
> Without extensive engineering information and the right test equipment, 
> who knows?

Something that isn't there can't have an effect ...

> > 
> [...]
> It sounds like you could use more spare parts and/or computers.

I already have too many.

> Let us know what happens with the Broadcom card and with whatever BSD 
> you pick (the FreeBSD installer includes a rescue shell and a live system).

ok

