
issues with ext3 on lvm2 on md raid1 on sarge



Hello,

I am seeing issues while using an ext3 filesystem on top of lvm2 on top of an md raid1 device.

My system is running sarge with:

kernel-image-2.6.8-2-386        2.6.8-16sarge1
lvm2    2.01.04-5
mdadm   1.9.0-4sarge1
bonnie++        1.03a
jfsutils        1.1.7-1
e2fsprogs       1.37-2sarge1

The raid1 device is composed of two 250GB SATA drives on a Silicon Image 3112-based controller.

I created the raid1 device using the following command:

localhost:~# mdadm -C /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
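
For anyone reproducing this, one way to wait for the initial sync is to poll /proc/mdstat. The following is only a rough sketch: the function names are made up for illustration, and it assumes the kernel reports progress on a line containing "resync =" or "recovery =" (the optional file argument exists only so the check can be tried against a saved copy of mdstat).

```shell
# Hypothetical helper: succeed while any md array is still syncing.
# $1 is an optional path to read instead of /proc/mdstat.
md_sync_in_progress() {
    mdstat="${1:-/proc/mdstat}"
    # A progress line such as "resync =  5.6% (...)" or "recovery = ..."
    # is assumed to be present only while an array is being built or rebuilt.
    grep -Eq '(resync|recovery) *=' "$mdstat"
}

# Hypothetical helper: poll once a minute until no sync is reported.
wait_for_md_sync() {
    while md_sync_in_progress "${1:-/proc/mdstat}"; do
        sleep 60
    done
}
```

Called with no argument, wait_for_md_sync reads /proc/mdstat and returns once no resync or recovery line is left.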

After waiting for the array to build, I created the lvm volumes and some filesystems.

localhost:~# pvcreate /dev/md0
localhost:~# vgcreate raid1 /dev/md0
localhost:~# lvcreate -L75G -n ext3filesystem raid1
localhost:~# lvcreate -L25G -n jfsfilesystem raid1

localhost:~# mkfs.ext3 /dev/raid1/ext3filesystem
localhost:~# mkfs.jfs /dev/raid1/jfsfilesystem

localhost:~# mkdir /mnt/ext3filesystem
localhost:~# mkdir /mnt/jfsfilesystem
localhost:~# mount /dev/raid1/ext3filesystem /mnt/ext3filesystem
localhost:~# mount /dev/raid1/jfsfilesystem /mnt/jfsfilesystem

I then decided to run some performance tests on each of the filesystems using bonnie++.

localhost:~# bonnie++ -d /mnt/jfsfilesystem/ -u 1000
localhost:~# bonnie++ -d /mnt/ext3filesystem/ -u 1000

The jfs filesystem completed the test without any problems, but every time I run the test on the ext3 filesystem, the test fails midway through when the filesystem is remounted read-only. The following errors show up in dmesg:

EXT3-fs error (device dm-2): ext3_free_blocks: bit already cleared for block 278821
Aborting journal on device dm-2.
EXT3-fs error (device dm-2): ext3_free_blocks: bit already cleared for block 278822
ext3_abort called.
EXT3-fs abort (device dm-2): ext3_journal_start: Detected aborted journal
Remounting filesystem read-only
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device dm-2) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-2) in ext3_truncate: Journal has aborted
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device dm-2) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-2) in ext3_orphan_del: Journal has aborted
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device dm-2) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-2) in ext3_delete_inode: Journal has aborted
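
To compare runs, the block numbers can be pulled out of a log like the one above. This is a minimal sketch, assuming the message wording matches the ext3_free_blocks lines shown; the function name ext3_bad_blocks is made up for illustration.

```shell
# Hypothetical helper: print the unique block numbers reported by
# ext3_free_blocks "bit already cleared" errors, in numeric order.
# Reads kernel log text on stdin, for example: dmesg | ext3_bad_blocks
ext3_bad_blocks() {
    sed -n 's/.*ext3_free_blocks: bit already cleared for block \([0-9][0-9]*\).*/\1/p' |
        sort -n -u
}
```

If the same blocks show up on every run, that would point in a different direction than a new set of blocks each time.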


I also ran bonnie++ on the / filesystem, which is ext3 but does not use md raid or lvm.

localhost:~# bonnie++ -d /tmp/ -u 1000

This completed without any problems.

While searching for information on the problem, I came across the following. Debian bug #295657 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=295657 is very similar to what I am experiencing. That bug was closed after the problem was determined to be caused by defective RAM. I also came across this Red Hat bug https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=152162 which was determined to have been introduced in the kernel around 2.6.10.

To rule out defective RAM I ran memtest86+ 1.65 on the machine overnight. It successfully completed 24 passes without any errors in about 12.5 hours, so I think that my RAM is good. It does not appear to be a hard drive issue since the tests complete fine on the jfs filesystem.

Are there any other hardware tests that I should try, or is this likely to be a software bug?

Mike


