
RE: DMA errors on hard drive, but why?



Here's some potentially "out-there" speculation - and I hope I'm not leading
you down the garden path.

It just *might* be a "signal integrity" issue being introduced by the caddy.
This could be in either the area of "stubs" or "ground returns".  I did this
electrical behavior analysis for 14 years at a previous job.

I apologize to the readership for the length of the explanation, but as
computers/disks push higher in speed this problem becomes more prevalent,
so maybe a post like this is due.

High-speed digital signals do *not* like stubs on the transmission line they
are propagating on.  The term "transmission line" here means a cable wire or
pc-board trace.  In a perfect world there would just be two single pieces of
wire between any "talker" and "listener" - one for the "signal" and the
other the "ground return" path.  That's the ideal.

Let's look at a signal wire on an IDE cable (ignoring ground return issues
for the moment), shown below:

    C                 S    M
    +-----------------+----+

There is the (C)omputer connector, the (S)lave connector and the (M)aster
connector.  With only one drive (it would be a Master), it needs to go on
the end connector, M.  If you put it on the first drive connector, S, a
signal sent by the computer doesn't stop when it gets to "S" (obviously).
It continues to propagate to "M".  When it gets to M there is no
termination, so just about the entire signal is reflected back towards S
and C.  This reflection can play havoc with the voltage waveform at S and
perhaps cause timing problems, so that a logic "1" or "0" is occasionally
misinterpreted by the drive connected to "S".
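
To put a rough number on "just about the entire signal", here is a minimal
Python sketch of the standard transmission-line reflection coefficient,
gamma = (Z_load - Z0) / (Z_load + Z0).  The 100 ohm line impedance and the
function name are illustrative assumptions of mine, not measured values for
any particular IDE cable, and a real empty connector is only approximately
an open circuit.

```python
def reflection_coefficient(z_load, z0):
    """Fraction of the incident voltage reflected where a line of
    characteristic impedance z0 meets a load of impedance z_load."""
    if z_load == float("inf"):
        # Open circuit: the whole wave is reflected with the same polarity.
        return 1.0
    return (z_load - z0) / (z_load + z0)


Z0 = 100.0  # ASSUMED line impedance in ohms, for illustration only

cases = [
    ("open end (nothing on M)", float("inf")),
    ("matched termination", Z0),
    ("short circuit", 0.0),
]

for label, z_load in cases:
    gamma = reflection_coefficient(z_load, Z0)
    print(f"{label}: gamma = {gamma:+.2f}")
```

A gamma of +1 means the full signal comes back down the cable, 0 means
nothing is reflected, and -1 means it comes back inverted.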

As it is, there is even a bit of a stub at S when there are two drives on the
line.  Let's redraw the diagram this way.

                      S    M
    C                 |    |
    +-----------------+----+

Drive manufacturers try to get the XMT/RCV logic for the drive as close to
the connector as they can to minimize stubbing since this increases the
"reliability" of the drives from a signaling perspective.

When you add a caddy to the scenario you are essentially increasing the
length of the stubs like this:

                      S    M
                      |    |
    C                 |    |
    +-----------------+----+
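
A rough way to judge whether a longer stub matters is to compare its
round-trip delay with the rise time of the signal.  The sketch below just
does that arithmetic.  Every number in it is an assumption chosen for
illustration (about 5 ns per metre of propagation delay, a 5 ns rise time,
made-up stub lengths); none of them comes from a drive, caddy or ATA data
sheet, and the function names are hypothetical.

```python
def stub_round_trip_ns(length_cm, delay_ns_per_m=5.0):
    """Time for a signal edge to travel down a stub and back, in ns.

    delay_ns_per_m is an ASSUMED propagation delay for the wiring.
    """
    return 2 * (length_cm / 100.0) * delay_ns_per_m


def stub_fraction_of_rise(length_cm, rise_time_ns, delay_ns_per_m=5.0):
    """Round-trip stub delay expressed as a fraction of the rise time.

    The larger this fraction, the more the reflection from the stub
    can distort the edge seen at the receiver.
    """
    return stub_round_trip_ns(length_cm, delay_ns_per_m) / rise_time_ns


RISE_TIME_NS = 5.0  # ASSUMED signal rise time, illustration only

# Hypothetical stub lengths: bare drive vs. drive mounted in a caddy.
for label, length_cm in [("drive alone", 2.0), ("drive in caddy", 10.0)]:
    frac = stub_fraction_of_rise(length_cm, RISE_TIME_NS)
    print(f"{label}: {length_cm} cm stub, round trip = "
          f"{stub_round_trip_ns(length_cm):.2f} ns "
          f"({frac:.0%} of the rise time)")
```

The point is only the trend: a few extra centimetres inside a caddy can
turn a negligible stub into one whose reflection arrives while the edge is
still rising.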

One way to test whether this could be the problem would be to connect each
drive and caddy - one at a time - to the M connector (be sure to make it a
Master drive if it had been the Slave).  If, under heavy access, it runs
reliably, then signal integrity issues may lie at the heart of the problem.
Of course if there is still a problem then you have to test different
scenarios - drive_1 with caddy_1, drive_1 with caddy_2, etc.

Next we have the problem of signal returns.  When a voltage is switched an
electrical current flows.  That current must flow back to the source.  It
does this through a "ground return".  Let's be clear - the current *will*
return to the source.  The question is, will it do so in a manner that does
not negatively affect itself or any other signal?  Again, as signal
speed increases, a point is reached where one return path (wire) per signal
is needed to keep the system running.  Sharing a return wire between
multiple signals becomes a dangerous practice.  Also, if signal wires are
not isolated from each other with ground return wires they can generate
"cross-talk" in each other and signals will be degraded.  This is why, for
example, older and slower IDE drives could be run with a 40 wire cable, but
the newer and faster ATA66 and later drives need an 80 wire cable.
Essentially in the 80 wire cable every other wire is a ground return wire
and the signal wires are isolated from each other.  Note that the
connectors don't have 80 pins.  The ground returns are shorted together to
the correct pins within the connector header.

So... Before you used the caddies, were you running a 40 wire or 80 wire (40
signal/ground pairs) cable?  I would then wonder how adding a caddy might
affect the ground return paths for the signals.  The problem could
potentially lie there.  You need to ask this question: "Will the caddy
support ATA66 and faster drives?"

All of this info is sort of second nature to people who have set up
SCSI systems.  They are *very* aware of terminated cables and rules to
eliminate stubbing.  And the SCSI specification itself covers how the cables
are manufactured.  With IDE drives the picture is more complicated
since it is very easy to mismatch "slow" IDE components with "fast" IDE
components.

Again, apologies to all for the length.
Cheers,
-rick


-----Original Message-----
From: James Green [mailto:jg@jmkg.clara.co.uk]
Sent: Sunday, June 16, 2002 4:24 AM
To: debian-user@lists.debian.org
Subject: DMA errors on hard drive, but why?


Hi all,

I've got two Maxtor IDE hard drives. They have been working fine, but since
I moved them into drive caddies I've been getting the following messages in
dmesg output for the second drive:

cyberstorm:~# dmesg | grep hdb
    ide0: BM-DMA at 0xdc00-0xdc07, BIOS settings: hda:DMA, hdb:DMA
hdb: MAXTOR 6L080J4, ATA DISK drive
hdb: 156355584 sectors (80054 MB) w/1819KiB Cache, CHS=9732/255/63
 hdb: hdb1 hdb2 hdb3 hdb4 < hdb5 >
hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdb: dma_intr: error=0x84 { DriveStatusError BadCRC }
hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdb: dma_intr: error=0x84 { DriveStatusError BadCRC }
[ loops ]

At the end of this loop DMA for both drives is deactivated. I can
reactivate it, but as soon as hdb gets some heavy activity the same errors
occur.

So I'm wondering what could be the cause. I've tried two different caddies
(both chassis and holder) so it doesn't look like a connection problem
unless they both suffer from it. The drives were both fine outside of the
caddies, and I've double checked that the IDE cable is firmly seated in
both connectors.

What should I do to diagnose this fault? Many thanks.

James Green


--
To UNSUBSCRIBE, email to debian-user-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact
listmaster@lists.debian.org






