Bug#410204: linux-image-2.6.18-4-amd64: Data corruption on dm-crypt+XFS



On Thu, Feb 08, 2007 at 05:11:32PM +0200, Sami Liedes wrote:
> XFS, but that triggers it easily and often. A fix was merged upstream
> in 2.6.18.6 ("[PATCH] dm crypt: Fix data corruption with dm-crypt over
> RAID5"), but is not apparently included in the Debian kernel (or at
> least I ran into this with a very similar backtrace). See:

Hmm, it seems the fix (in fact all of 2.6.18.6) IS included in the
Debian kernel. I wonder which fix is missing, then, or whether the bug
is still present in the vanilla kernel tree. Here's the oops:

------------------------------------------------------------
Feb  8 04:43:08 lh kernel: Filesystem "dm-7": Disabling barriers, not supported by the underlying device
Feb  8 04:43:08 lh kernel: XFS mounting filesystem dm-7
Feb  8 04:43:08 lh kernel: Ending clean XFS mount for filesystem: dm-7
Feb  8 04:46:10 lh kernel: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
Feb  8 04:46:10 lh kernel:  [<ffffffff802a749a>] page_to_pfn+0x0/0x33
Feb  8 04:46:10 lh kernel: PGD 24a6c067 PUD 1da31067 PMD 0
Feb  8 04:46:10 lh kernel: Oops: 0000 [1] SMP
Feb  8 04:46:10 lh kernel: CPU 0
Feb  8 04:46:10 lh kernel: Modules linked in: sha256 aes dm_crypt snd_intel8x0 xfs ipt_owner ipt_REJECT xt_state xt_tcpudp iptable_filter ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables radeon drm binfmt_misc freq_table ppdev lp button ac battery ipv6 nls_iso8859_1 nls_cp437 vfat fat ext2 it87 hwmon_vid i2c_isa eeprom usbmouse ide_cd cdrom tsdev snd_ac97_codec snd_ac97_bus snd_opl3_lib snd_pcm_oss snd_mixer_oss snd_hwdep snd_mpu401 snd_mpu401_uart i2c_nforce2 snd_rawmidi snd_seq_device analog i2c_core parport_pc parport snd_pcm snd_timer psmouse serio_raw snd snd_page_alloc gameport evdev floppy soundcore pcspkr ext3 jbd mbcache dm_mirror dm_snapshot dm_mod ide_generic sd_mod ide_disk sata_nv libata scsi_mod 3c59x mii forcedeth generic amd74xx ide_core ehci_hcd ohci_hcd thermal processor fan
Feb  8 04:46:10 lh kernel: Pid: 198, comm: pdflush Not tainted 2.6.18-4-amd64 #1
Feb  8 04:46:10 lh kernel: RIP: 0010:[<ffffffff802a749a>]  [<ffffffff802a749a>] page_to_pfn+0x0/0x33
Feb  8 04:46:10 lh kernel: RSP: 0018:ffff81003e7e97d8  EFLAGS: 00010297
Feb  8 04:46:10 lh kernel: RAX: 0000000000000000 RBX: ffff81000bce2640 RCX: 0000000000000000
Feb  8 04:46:10 lh kernel: RDX: 0000000000000056 RSI: ffff81000bce2640 RDI: 0000000000000000
Feb  8 04:46:10 lh kernel: RBP: ffff81003b3c8000 R08: 0000000000000000 R09: ffff810037ade870
Feb  8 04:46:10 lh kernel: R10: 0000000000000000 R11: ffff81000c1a1ec0 R12: ffff81000bce2640
Feb  8 04:46:10 lh kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff81003e8f8088
Feb  8 04:46:10 lh kernel: FS:  00002b4d40df3d20(0000) GS:ffffffff80521000(0000) knlGS:00000000f7b446c0
Feb  8 04:46:10 lh kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Feb  8 04:46:10 lh kernel: CR2: 0000000000000000 CR3: 000000001e0c6000 CR4: 00000000000006e0
Feb  8 04:46:10 lh kernel: Process pdflush (pid: 198, threadinfo ffff81003e7e8000, task ffff810037ade870)
Feb  8 04:46:10 lh kernel: Stack:  ffffffff8022bf96 ffff810037ade870 000000000000d400 0000000000000000
Feb  8 04:46:10 lh kernel:  ffff810000000001 0000000000000001 ffff81000bce2640 ffff81003e8f8088
Feb  8 04:46:10 lh kernel:  ffff8100192517c0 ffff810007f997a8 0000000000000056 000000000002a000
Feb  8 04:46:10 lh kernel: Call Trace:
Feb  8 04:46:10 lh kernel:  [<ffffffff8022bf96>] blk_recount_segments+0x7e/0x21b
Feb  8 04:46:10 lh kernel:  [<ffffffff802bb9ae>] __bio_clone+0x71/0x8a
Feb  8 04:46:10 lh kernel:  [<ffffffff802bb9fc>] bio_clone+0x35/0x3d
Feb  8 04:46:10 lh kernel:  [<ffffffff8822776a>] :dm_crypt:crypt_map+0xcd/0x304
Feb  8 04:46:10 lh kernel:  [<ffffffff880d92bf>] :dm_mod:__map_bio+0x47/0x9b
Feb  8 04:46:10 lh kernel:  [<ffffffff880d9c1f>] :dm_mod:__split_bio+0x172/0x37d
Feb  8 04:46:10 lh kernel:  [<ffffffff880da432>] :dm_mod:dm_request+0x101/0x110
Feb  8 04:46:10 lh kernel:  [<ffffffff80219f55>] generic_make_request+0x13a/0x14d
Feb  8 04:46:10 lh kernel:  [<ffffffff80231028>] submit_bio+0xcb/0xd2
Feb  8 04:46:10 lh kernel:  [<ffffffff8022aaa5>] __bio_add_page+0x188/0x1ce
Feb  8 04:46:10 lh kernel:  [<ffffffff883ccd8b>] :xfs:xfs_submit_ioend_bio+0x1e/0x27
Feb  8 04:46:10 lh kernel:  [<ffffffff883cd7c3>] :xfs:xfs_page_state_convert+0xa2f/0xb6e
Feb  8 04:46:10 lh kernel:  [<ffffffff883cdb30>] :xfs:xfs_vm_writepage+0xa7/0xdd
Feb  8 04:46:10 lh kernel:  [<ffffffff8021ac61>] mpage_writepages+0x1a6/0x34d
Feb  8 04:46:10 lh kernel:  [<ffffffff883cda89>] :xfs:xfs_vm_writepage+0x0/0xdd
Feb  8 04:46:10 lh kernel:  [<ffffffff80256d07>] do_writepages+0x20/0x2f
Feb  8 04:46:10 lh kernel:  [<ffffffff8022dbd7>] __writeback_single_inode+0x1b4/0x38b
Feb  8 04:46:10 lh kernel:  [<ffffffff880d9a46>] :dm_mod:dm_any_congested+0x38/0x3f
Feb  8 04:46:10 lh kernel:  [<ffffffff880db58a>] :dm_mod:dm_table_any_congested+0x46/0x63
Feb  8 04:46:10 lh kernel:  [<ffffffff8021edb1>] sync_sb_inodes+0x1d1/0x2b5
Feb  8 04:46:10 lh kernel:  [<ffffffff802901be>] keventd_create_kthread+0x0/0x61
Feb  8 04:46:10 lh kernel:  [<ffffffff8024c991>] writeback_inodes+0x7d/0xd3
Feb  8 04:46:10 lh kernel:  [<ffffffff802a894a>] background_writeout+0x82/0xb5
Feb  8 04:46:10 lh kernel:  [<ffffffff8025242d>] pdflush+0x0/0x1ed
Feb  8 04:46:10 lh kernel:  [<ffffffff80252570>] pdflush+0x143/0x1ed
Feb  8 04:46:10 lh kernel:  [<ffffffff802a88c8>] background_writeout+0x0/0xb5
Feb  8 04:46:10 lh kernel:  [<ffffffff8023055a>] kthread+0xd4/0x107
Feb  8 04:46:10 lh kernel:  [<ffffffff80259360>] child_rip+0xa/0x12
Feb  8 04:46:10 lh kernel:  [<ffffffff802901be>] keventd_create_kthread+0x0/0x61
Feb  8 04:46:10 lh kernel:  [<ffffffff80230486>] kthread+0x0/0x107
Feb  8 04:46:10 lh kernel:  [<ffffffff80259356>] child_rip+0x0/0x12
Feb  8 04:46:10 lh kernel:
Feb  8 04:46:10 lh kernel:
Feb  8 04:46:10 lh kernel: Code: 48 8b 07 48 c1 e8 3a 48 8b 14 c5 20 d0 52 80 48 b8 b7 6d db
Feb  8 04:46:10 lh kernel: RIP  [<ffffffff802a749a>] page_to_pfn+0x0/0x33
Feb  8 04:46:10 lh kernel:  RSP <ffff81003e7e97d8>
Feb  8 04:46:10 lh kernel: CR2: 0000000000000000
------------------------------------------------------------
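For anyone who wants to compare this backtrace against the ones in the
upstream reports, here is a minimal Python sketch that pulls the
(module, symbol) pairs out of the "Call Trace:" part of a syslog-style
oops. The function name parse_call_trace and the regular expression are
illustrative assumptions based only on the line format visible in the
log above; other kernel versions may format frames differently.

```python
import re

# Matches call-trace entries in the format seen in the oops above, e.g.
#   "[<ffffffff8822776a>] :dm_crypt:crypt_map+0xcd/0x304"
# The ":module:" prefix is optional (absent for built-in symbols).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+"
    r"(?::(?P<module>\w+):)?"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_call_trace(log_text):
    """Return (module, symbol) pairs for the frames after 'Call Trace:'.

    module is None for symbols that carry no ':module:' prefix.
    """
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line ends the trace
        frames.append((match.group("module"), match.group("symbol")))
    return frames


# A few lines copied from the oops above, used as sample input.
SAMPLE = """\
Feb  8 04:46:10 lh kernel: Call Trace:
Feb  8 04:46:10 lh kernel:  [<ffffffff8022bf96>] blk_recount_segments+0x7e/0x21b
Feb  8 04:46:10 lh kernel:  [<ffffffff802bb9ae>] __bio_clone+0x71/0x8a
Feb  8 04:46:10 lh kernel:  [<ffffffff802bb9fc>] bio_clone+0x35/0x3d
Feb  8 04:46:10 lh kernel:  [<ffffffff8822776a>] :dm_crypt:crypt_map+0xcd/0x304
Feb  8 04:46:10 lh kernel:  [<ffffffff880d92bf>] :dm_mod:__map_bio+0x47/0x9b
Feb  8 04:46:10 lh kernel:
"""

for module, symbol in parse_call_trace(SAMPLE):
    print(module or "(built-in)", symbol)
```

Running the same function over two oops logs and comparing the
resulting lists gives a quick check of whether the traces pass through
the same functions, ignoring addresses and offsets.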

This happened at the beginning of copying a large amount of data (/
and /home) to an empty XFS filesystem on a dm-crypted EVMS partition.
Specifically, I have /dev/evms/XFS1-crypted, which maps fairly
directly onto a single hard disk: it lives in an LVM2 volume group
that spans sda5 and sda6, but XFS1-crypted itself resides entirely in
the sda5 area. From XFS1-crypted, a decrypted volume, XFS1-decrypted,
has been dm-crypt-mapped using "cryptsetup luksOpen". I formatted
that with mkfs.xfs, mounted it, and was copying data to it when the
kernel oopsed.
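
As a rough sketch of the sequence described above, the following
Python snippet only builds and prints the commands; it does not run
them. The /dev/mapper path for the opened mapping, the mount point
/mnt/xfs1, and the use of "cp -a" are assumptions for illustration and
are not taken from the actual setup. Note that mkfs.xfs destroys any
data on the target device.

```python
import shlex


def reproduction_commands(crypted_dev, mapping_name, mount_point, sources):
    """Build (but do not run) the commands for the setup described above."""
    # Assumption: the mapping opened by cryptsetup appears under /dev/mapper.
    decrypted_dev = "/dev/mapper/" + mapping_name
    return [
        # Open the LUKS volume as a dm-crypt mapping (prompts for a passphrase).
        ["cryptsetup", "luksOpen", crypted_dev, mapping_name],
        # Create a fresh XFS filesystem on the decrypted mapping (destructive).
        ["mkfs.xfs", decrypted_dev],
        # Mount it at a hypothetical mount point.
        ["mount", decrypted_dev, mount_point],
        # Copy a large tree onto it; the copy tool is an assumption.
        ["cp", "-a", *sources, mount_point],
    ]


for cmd in reproduction_commands(
    "/dev/evms/XFS1-crypted", "XFS1-decrypted", "/mnt/xfs1", ["/home"]
):
    print(shlex.join(cmd))
```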

> 2. http://bugzilla.kernel.org/show_bug.cgi?id=7799
> 
> (esp. the last comment:
> "Bug in dmcrypt. There's been several bugs in dmcrypt that
> only XFS has triggered and the last of these that I know about
> was fixed in 2.6.19.")

I also wonder which fix this refers to. The changes in the 2.6.19
branch that are clearly bug fixes seem to be fixes for low-memory
situations, which this was not.

I can try to reproduce and debug this if that's helpful; just tell me
what I can do to help. (I have some, but not much, kernel experience,
mainly with some drivers for the 2.4 series.)

	Sami
