
Re: Fwd: xfs_force_shutdown(md(9,0),0x8) called from line 1070 of file xfs_trans.c.



On Thu, 2004-01-22 at 14:58, alberto wrote:
> 
> -----
> Dec  3 15:55:34  machine sshd(pam_unix)[1791]: session opened for
> user xxxx by (uid=500)
> Dec  3 15:56:08 machine kernel: 0x0: 58 41 47 46 00 00 00 01 00 00 00
> 04 00 10 00 00
> Dec  3 15:56:08 machine kernel: xfs_force_shutdown(md(9,0),0x8)
> called from line 1070 of file xfs_trans.c.  Return address =
> 0xf8a23aa8
> Dec  3 15:56:08 machine  kernel: Filesystem "md(9,0)": Corruption of
> in-memory data detected.  Shutting down filesystem: md(9,0)
> Dec  3 15:56:08 machine kernel: Please umount the filesystem, and
> rectify the problem(s)
> ------
> 
> What is the nature of this problem? Kernel drivers?  Hardware
> failure? Filesystem inconsistency?
> 

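The "md(9,0)", I believe, is a major/minor device number pair: major 9
is the Linux md (software RAID) driver, so this would be /dev/md0, the
array as a whole rather than one specific disk in it (any experts out
there, please correct me). You may want to compare logs from the various
times this happens to see if errors from the same member disk show up
each time.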
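
To turn a tag like md(9,0) into a device name while comparing logs, here
is a minimal sketch in C. It assumes the numbers in parentheses are the
block device major and minor, and that major 9 belongs to the md driver.
The names parse_dev_tag and guess_dev_node are made up for illustration;
they are not part of any kernel or XFS tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MD_MAJOR 9  /* assumed: Linux block major of the md driver */

struct dev_tag {
    char name[16];  /* driver name as printed in the log, e.g. "md" */
    int  major;
    int  minor;
};

/* Parse a tag shaped like "md(9,0)". Returns 1 on success, 0 otherwise. */
int parse_dev_tag(const char *tag, struct dev_tag *out)
{
    char close = 0;
    int n;

    assert(tag != NULL && out != NULL);
    n = sscanf(tag, "%15[a-z](%d,%d%c",
               out->name, &out->major, &out->minor, &close);
    return n == 4 && close == ')';
}

/* Write "/dev/mdN" into buf for an md tag. Returns 1 if the tag is md. */
int guess_dev_node(const struct dev_tag *t, char *buf, size_t len)
{
    assert(t != NULL && buf != NULL);
    if (t->major != MD_MAJOR || strcmp(t->name, "md") != 0)
        return 0;
    snprintf(buf, len, "/dev/md%d", t->minor);
    return 1;
}
```

Feeding it "md(9,0)" gives major 9, minor 0 and the guess /dev/md0. Which
member disks sit under that array is a separate question that
/proc/mdstat answers.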

If the issue is hardware, then a) most likely the disk referred to in
the logs will be the same in all instances or b) the RAID card itself
could be acting up. In either case, swapping out the suspected hardware
(only possible in RAID 1, 5 or similar setups, I suppose) should stop
the problems.

My guess is that the problem is with the kernel driver rather than
hardware. The kernel message seems to indicate that the buffered
filesystem data (changes to the filesystem that have yet to be written
to the disk) has been corrupted. If the error came when reading data
from the disk, that could mean buggy kernel drivers too, but it could
also be a hardware problem.
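
The hex dump in the quoted log can be decoded a little. Its first four
bytes, 58 41 47 46, are ASCII "XAGF", which as far as I know is the magic
number of an XFS allocation group free-space header (AGF). The sketch
below just parses the bytes and reads them as big-endian 32-bit words,
since XFS keeps its on-disk metadata big-endian. The helper names are
made up, and reading the later words as particular header fields would
be an assumption on my part.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse space-separated hex bytes ("58 41 47 46 ...") into out.
 * Returns the number of bytes stored (at most max). */
size_t parse_hex_bytes(const char *s, unsigned char *out, size_t max)
{
    size_t n = 0;

    assert(s != NULL && out != NULL);
    while (n < max) {
        char *end;
        unsigned long v = strtoul(s, &end, 16);

        if (end == s || v > 0xff)
            break;              /* no more digits, or not a single byte */
        out[n++] = (unsigned char)v;
        s = end;
    }
    return n;
}

/* Read a 32-bit big-endian value starting at p. */
uint32_t be32_at(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Run over the 16 bytes from the log, the four words come out as
0x58414746 ("XAGF"), 1, 4 and 0x00100000. If the magic reading is right,
the buffer that was dumped was an AGF block.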

The line in the kernel driver which is referred to in your log:

        /*
         * See if the caller is relying on us to shut down the
         * filesystem.  This happens in paths where we detect
         * corruption and decide to give up.
         */
        if ((tp->t_flags & XFS_TRANS_DIRTY) &&
            !XFS_FORCED_SHUTDOWN(tp->t_mountp))
                xfs_force_shutdown(tp->t_mountp, XFS_CORRUPT_INCORE);

To me, it looks like code that handles the problem rather than being the
source of the problem itself, but I'm no kernel hacker. Perhaps some
others can read more into this than I?
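
To make the shape of that check concrete, here is a toy model in C. The
structs and flag values are stand-ins I made up for illustration, not
the real XFS definitions. As I read it, the idea is that a transaction
which has already dirtied in-memory metadata cannot simply be dropped,
so giving up on it forces a shutdown, unless one is already in progress.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in flag values, assumed for illustration only. */
#define TRANS_DIRTY    0x01u
#define CORRUPT_INCORE 0x08u

struct mount {
    int      shut_down;        /* nonzero once the fs has been shut down */
    unsigned shutdown_reason;  /* flag recorded at shutdown time */
};

struct trans {
    unsigned      t_flags;
    struct mount *t_mountp;
};

/* Same shape as the quoted check: shut down only if the transaction is
 * dirty and the filesystem is not already shut down.
 * Returns 1 if this call triggered the shutdown, 0 otherwise. */
int cancel_after_error(struct trans *tp)
{
    assert(tp != NULL && tp->t_mountp != NULL);
    if ((tp->t_flags & TRANS_DIRTY) && !tp->t_mountp->shut_down) {
        tp->t_mountp->shut_down = 1;
        tp->t_mountp->shutdown_reason = CORRUPT_INCORE;
        return 1;
    }
    return 0;
}
```

A clean transaction leaves the mount alone, a dirty one shuts it down
once, and later calls do nothing. That fits the view that this line
reports a problem found earlier rather than causing it.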

Filesystem inconsistency, I think, would just be a symptom of one of the
other two possibilities mentioned.


> What can be a possible solution?
> 

Swap out bad hardware, try a newer version of the xfs driver.


dunno if that helps any, but there you have my $0.02

-davidc


