Re: Corrupt data - RAID sata_sil 3114 chip
Dave Jones wrote:
On Mon, Jan 12, 2009 at 10:30:42AM +0900, Tejun Heo wrote:
> Robert Hancock wrote:
> >> There are apparently some reports of issues on NVidia chipsets as
> >> well, though I don't have any details at hand.
> > Well, Carlos' email bounces, so much for that one. Anyone have any other
> > contacts at Silicon Image?
> I'll ping my SIMG contacts, but I've pinged them about this problem in the
> past and it didn't get anywhere.
I wish I'd read this thread last week.. I've been beating my head
against this problem all weekend.
I picked up a cheap 3114 card, and found that when I created a filesystem
with it on a 250GB disk, it got massive corruption very quickly.
My experience echoes most of the other people's in this thread, but here are
a few data points I've been able to figure out..
I ran badblocks -v -w -s on the disk, and after running
for nearly 24 hours, it reported a huge number of blocks
failing at the upper part of the disk.
I created a partition in this bad area to speed up testing..
   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1       30000   240974968+  83  Linux
/dev/sde2           30001       30200     1606500   83  Linux
/dev/sde3           30201       30401     1614532+  83  Linux
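For anyone wanting to place these partitions on the disk in absolute terms, the cylinder numbers above can be turned into sector numbers. This is a minimal Python sketch, assuming the classic fdisk geometry of 255 heads and 63 sectors per track with 512-byte sectors. That geometry is an assumption, and the function names are made up for illustration. The comparison against 2^28 sectors (the 28-bit LBA limit) is only one example of a boundary to check; nothing here shows that limit is actually involved.

```python
SECTOR_BYTES = 512
SECTORS_PER_CYLINDER = 255 * 63   # assumed geometry: 16065 sectors per cylinder
LBA28_LIMIT = 1 << 28             # sectors addressable with a 28-bit LBA

def cylinder_start_sector(cylinder):
    """First sector of a 1-based cylinder number, as fdisk counts them."""
    return (cylinder - 1) * SECTORS_PER_CYLINDER

def cylinders_to_kib(first, last):
    """Size in 1 KiB blocks of the cylinder range first..last, inclusive."""
    sectors = (last - first + 1) * SECTORS_PER_CYLINDER
    return sectors * SECTOR_BYTES // 1024

start = cylinder_start_sector(30001)   # first cylinder of /dev/sde2
print("sde2 starts at sector", start)
print("sde2 starts at byte offset", start * SECTOR_BYTES)
print("sde2 size in 1K blocks", cylinders_to_kib(30001, 30200))
print("start is at or above 2^28 sectors:", start >= LBA28_LIMIT)
```

The computed size for cylinders 30001 to 30200 is 1606500 1K blocks, which matches the Blocks column for sde2, so the geometry assumption at least agrees with the table.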
Rerunning badblocks on /dev/sde2 consistently fails when
it gets to the reading back 0x00 stage.
(Somehow it passes reading back 0xff, 0xaa and 0x55)
I was beginning to suspect the disk may be bad, but when I
moved it to a box with Intel sata, the badblocks run on that
same partition succeeds with no problems at all.
Given the corruption happens at high block numbers, I'm wondering
if maybe there's some kind of wraparound bug happening here.
(Though why only the 0x00 pattern fails would still be a mystery).
Yeah, that seems a bit bizarre.. Apparently somehow zeros are being
converted into non-zero.. Can you try zeroing out the partition by
dd'ing into it from /dev/zero or something, then dumping it back out to
see what kind of data is showing up?
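A rough sketch of the "dump it back out" step in Python, assuming the partition (or an image of it copied off with dd) has already been filled with zeros. The function name and its parameters are made up for illustration. It only reports the offset and value of the first few non-zero bytes it finds, which may be enough to see whether the bad data follows a pattern.

```python
def find_nonzero(path, chunk_size=1 << 20, limit=16):
    """Return up to `limit` (offset, byte_value) pairs for non-zero bytes in `path`."""
    hits = []
    offset = 0
    with open(path, "rb") as f:
        while len(hits) < limit:
            chunk = f.read(chunk_size)
            if not chunk:
                break                      # end of file or device
            if chunk.count(0) != len(chunk):
                # Something other than zero is in this chunk; locate it.
                for i, value in enumerate(chunk):
                    if value:
                        hits.append((offset + i, value))
                        if len(hits) >= limit:
                            break
            offset += len(chunk)
    return hits

# Hypothetical usage (needs read access to the device):
#   for off, value in find_nonzero("/dev/sde2"):
#       print(f"offset {off:#x}: {value:#04x}")
```

Reading in 1 MiB chunks keeps memory use flat, and stopping after a handful of hits avoids flooding the terminal if a large region turns out to be corrupted.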
After reading about the firmware update fixing it, I thought I'd
give that a shot. This was pretty much complete fail.
The DOS utility for flashing claims I'm running BIOS 5.0.39,
which looking at http://www.siliconimage.com/support/searchresults.aspx?pid=28&cat=15
is quite ancient. So I tried the newer ones.
Same experience with both 22.214.171.124, and 5.0.73
"BIOS version in the input file is not a newer version"
Forcing it to write anyway gets..
"Data is different at address 65f6h"