
xfs snapshot crash on 2.6.11



I have had an xfs snapshotting process running every four hours for the past week: it creates an xfs snapshot, backs up recently changed files with tar, and then removes the snapshot.
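For reference, the cycle looks roughly like the sketch below (my actual script is longer; volume group, LV, and mount point names here are placeholders, assuming LVM2 snapshots over device-mapper, which matches the dm_snapshot module in the oops):

```shell
#!/bin/sh
# Sketch of the 4-hourly snapshot/backup cycle described above.
# VG/LV/mount names are hypothetical placeholders.

snapshot_backup() {
    VG=vg0
    LV=data
    SNAP="${LV}_snap"
    MNT=/mnt/snap

    # 1. Create a copy-on-write snapshot of the live XFS volume.
    lvcreate --snapshot --size 1G --name "$SNAP" "/dev/$VG/$LV" || return 1

    # 2. Mount it read-only; nouuid is required because the snapshot
    #    carries the same filesystem UUID as the origin XFS volume.
    mount -t xfs -o ro,nouuid "/dev/$VG/$SNAP" "$MNT" || return 1

    # 3. Back up files modified in the last 4 hours (240 minutes).
    find "$MNT" -type f -mmin -240 -print0 \
        | tar --null -T - -czf "/backup/incr-$(date +%Y%m%d%H%M).tar.gz"

    # 4. Tear the snapshot down again.
    umount "$MNT"
    lvremove -f "/dev/$VG/$SNAP"
}

# Run as root, e.g. from cron: snapshot_backup
```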

I noticed this morning that I had been receiving these errors over the past day:

XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4340 of file fs/xfs/xfs_bmap.c. Caller 0xc0215937
[<c01ecaf9>] xfs_bmap_read_extents+0x409/0x540
[<c0215937>] xfs_iread_extents+0x77/0x110
[<c0215937>] xfs_iread_extents+0x77/0x110
[<c01ece7e>] xfs_bmapi+0x24e/0x17b0
[<c0215956>] xfs_iread_extents+0x96/0x110
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c0306f3b>] __clone_and_map+0xfb/0x3a0
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c030730b>] __split_bio+0x12b/0x150
[<c0219c0a>] xfs_iomap+0x19a/0x560
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c0219c80>] xfs_iomap+0x210/0x560
[<c023bf47>] __linvfs_get_block+0x97/0x360
[<c0164363>] recalc_bh_state+0x13/0xe0
[<c0162048>] set_bh_page+0x48/0x50
[<c01616ca>] alloc_page_buffers+0x6a/0xc0
[<c023c254>] linvfs_get_block+0x44/0x50
[<c0162d63>] block_read_full_page+0x243/0x330
[<c023c254>] linvfs_get_block+0x44/0x50
[<c023c210>] linvfs_get_block+0x0/0x50
[<c0184a06>] do_mpage_readpage+0x396/0x4c0
[<c023c210>] linvfs_get_block+0x0/0x50
[<c013ec50>] find_or_create_page+0x30/0xd0
[<c023cd82>] _pagebuf_lookup_pages+0x202/0x350
[<c02671b3>] radix_tree_node_alloc+0x23/0x70
[<c026721f>] radix_tree_preload+0x1f/0xd0
[<c026749d>] radix_tree_insert+0xed/0x110
[<c013e7ee>] add_to_page_cache+0x7e/0xe0
[<c0184c61>] mpage_readpages+0x131/0x160
[<c023c210>] linvfs_get_block+0x0/0x50
[<c0142f80>] prep_new_page+0x60/0x70
[<c0145f22>] read_pages+0x132/0x140
[<c023c210>] linvfs_get_block+0x0/0x50
[<c0143a33>] __alloc_pages+0x2e3/0x420
[<c014602d>] __do_page_cache_readahead+0xfd/0x160
[<c014623a>] blockable_page_cache_readahead+0x3a/0x80
[<c01464cd>] page_cache_readahead+0x24d/0x2d0
[<c013f4e4>] do_generic_mapping_read+0x614/0x630
[<c013f802>] __generic_file_aio_read+0x212/0x250
[<c013f500>] file_read_actor+0x0/0xf0
[<c0116c3f>] activate_task+0x6f/0xb0
[<c0242b28>] xfs_read+0x138/0x310
[<c023ee8e>] linvfs_aio_read+0x8e/0xa0
[<c015ef57>] do_sync_read+0xb7/0xf0
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c03862d6>] _spin_lock+0x16/0x90
[<c019118a>] dnotify_parent+0x3a/0xb0
[<c015f075>] vfs_read+0xe5/0x160
[<c015f391>] sys_read+0x51/0x80
[<c010328f>] syscall_call+0x7/0xb



Then this morning, the automated tar process running against the snapshot triggered the following error:


Access to block zero: fs: <dm-3> inode: 159767047 start_block : 0 start_off : 0 blkcnt : 0 extent-state : 0
------------[ cut here ]------------
kernel BUG at fs/xfs/support/debug.c:106!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: dgap dgrp parport_pc lp parport md5 ipv6 st shpchp pci_hotplug joydev ehci_hcd usbhid uhci_hcd siimage piix e1000 tsdev evdev dm_mirror dm_snapshot psmouse ide_disk ide_cd ide_core cdrom unix
CPU:    1
EIP:    0060:[<c0246441>]    Not tainted VLI
EFLAGS: 00010246   (2.6.11)
EIP is at cmn_err+0xa1/0xc0
eax: 00000000   ebx: c03aca80   ecx: ffffffff   edx: 10000000
esi: 00000000   edi: c04e9120   ebp: 00000000   esp: d0587868
ds: 007b   es: 007b   ss: 0068
Process tar (pid: 23592, threadinfo=d0586000 task=d7650a80)
Stack: c03a06d5 c03a6542 c04e9120 00000286 c03aca80 00000000 00000000 d05879b8 c01eb937 00000000 c03aca80 c965fc80 0985da07 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000310 ddf66aac 00000000
Call Trace:
[<c01eb937>] xfs_bmap_search_extents+0x107/0x130
[<c01ecee1>] xfs_bmapi+0x2b1/0x17b0
[<c0306c4a>] __map_bio+0x4a/0x120
[<c0306f3b>] __clone_and_map+0xfb/0x3a0
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c030730b>] __split_bio+0x12b/0x150
[<f8987f99>] snapshot_map+0x89/0x3c0 [dm_snapshot]
[<c02c824d>] generic_make_request+0x17d/0x220
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c0219c0a>] xfs_iomap+0x19a/0x560
[<c023bf47>] __linvfs_get_block+0x97/0x360
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c030730b>] __split_bio+0x12b/0x150
[<c0164757>] bio_alloc+0xe7/0x1e0
[<c023c254>] linvfs_get_block+0x44/0x50
[<c01847c8>] do_mpage_readpage+0x158/0x4c0
[<c0122ad2>] __do_softirq+0x62/0xd0
[<c038676f>] _spin_unlock_irqrestore+0xf/0x30
[<c02e1211>] megaraid_mbox_build_cmd+0x891/0xca0
[<c0142263>] mempool_alloc+0x73/0x140
[<c026721f>] radix_tree_preload+0x1f/0xd0
[<c013e7ee>] add_to_page_cache+0x7e/0xe0
[<c0184c61>] mpage_readpages+0x131/0x160
[<c023c210>] linvfs_get_block+0x0/0x50
[<c0101a93>] __switch_to+0x23/0x1c0
[<c0142f80>] prep_new_page+0x60/0x70
[<c0145f22>] read_pages+0x132/0x140
[<c023c210>] linvfs_get_block+0x0/0x50
[<c0143a33>] __alloc_pages+0x2e3/0x420
[<c014602d>] __do_page_cache_readahead+0xfd/0x160
[<c014623a>] blockable_page_cache_readahead+0x3a/0x80
[<c01464cd>] page_cache_readahead+0x24d/0x2d0
[<c013f4e4>] do_generic_mapping_read+0x614/0x630
[<c013f802>] __generic_file_aio_read+0x212/0x250
[<c013f500>] file_read_actor+0x0/0xf0
[<c0116c3f>] activate_task+0x6f/0xb0
[<c0242b28>] xfs_read+0x138/0x310
[<c023ee8e>] linvfs_aio_read+0x8e/0xa0
[<c015ef57>] do_sync_read+0xb7/0xf0
[<c0132f10>] autoremove_wake_function+0x0/0x60
[<c03862d6>] _spin_lock+0x16/0x90
[<c019118a>] dnotify_parent+0x3a/0xb0
[<c015f075>] vfs_read+0xe5/0x160
[<c015f391>] sys_read+0x51/0x80
[<c010328f>] syscall_call+0x7/0xb
Code: b8 20 91 4e c0 89 44 24 08 8b 04 ad 20 89 3f c0 89 44 24 04 e8 61 75 ed ff 8b 54 24 0c b8 04 89 3f c0 e8 23 03 14 00 85 ed 75 08 <0f> 0b 6a 00 52 65 3a c0 83 c4 10 5b 5e 5f 5d c3 eb 0d 90 90 90


-=-

Immediately after the tar process crashed, my script continued and attempted to unmount the snapshot volume and lvremove the snapshot device, both of which appear to have failed.


Is this a filesystem corruption issue? An XFS module issue? A dm-snapshot module issue? I am using Debian/Sarge with a Debian 2.6.11 kernel rebuilt with the newer megaraid drivers.

Regards,

satadru


