
Bug#794680: drbd kernel module incompatible with drbd-utils -> kernel panics



Control: tag -1 moreinfo

The panic is in a network communication thread, not in anything
handling commands from drbd-utils, so I'm not convinced that this has
anything to do with the version of the latter.

On Wed, 2015-08-05 at 17:17 +0100, Matthew Vernon wrote:
[...]
> Following that suggestion, I installed the kernel module 8.4.6 from
> upstream, and the kernel has stopped panicking.

The current upstream (in-tree) version is 8.4.5 (still with the same
API and protocol versions).

[...]
> You might argue that drbd upstream's api/proto discrimination is
> inadequate (and perhaps a bug report should go there), but nonetheless
> kernel panics are a serious flaw in the kernel (or the offending
> module) IMAO.

Clearly the driver ought not to crash, but I'm not sure that a
wholesale update is the right solution.

The first plausible return address on the stack points to the instruction after
<http://sources.debian.net/src/linux/3.16.7-ckt11-1%2Bdeb8u2/drivers/block/drbd/drbd_receiver.c/#L5443>
which implies something went wrong in drbd_recv_short()
<http://sources.debian.net/src/linux/3.16.7-ckt11-1%2Bdeb8u2/drivers/block/drbd/drbd_receiver.c/#L478>.
The stack dump (not the call stack) shows:

[ffff8800022f3d90] ffff8800022f3d88 0000000000000010 0000000000000000 0000000000000000
                   iov.iov_base !!! iov.iov_len      msg.msg_name     msg.msg_namelen      
[ffff8800022f3db0] ffff8800022f3d90 0000000000000001 0000000000000000 0000000000000000
                   msg.msg_iov = &iov
                                    msg.msg_iovlen   msg.msg_control  msg.msg_controllen
[ffff8800022f3dd0] 0000000000004100 ffffffffa02577be ffff880016c92080 00000010 00000000
                   msg.msg_flags    return address   &connection->flags
                                                                      header_size
                                                                               received

which looks consistent with the stack frame of drbd_recv_short() (plus
16 bytes from the stack frame of drbd_asender()).  However, iov.iov_base
is clearly wrong: it is equal to RSP-8, not the address of the buffer.
It's also equal to the faulting RIP.
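
To make the reading of the dump easier to check, here is a minimal
user-space sketch in C that models the two locals of drbd_recv_short()
and compares field addresses and values against the dump.  The names
kvec_model, msghdr_model, frame_model and FRAME_BASE are hypothetical;
the field order is assumed to follow the 3.16-era struct kvec and
struct msghdr on an LP64 target such as x86-64, and the assumption that
iov sits directly below msg is taken from the dump annotations, not
from the compiler.  The flag constants are the Linux values of
MSG_WAITALL and MSG_NOSIGNAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the kernel structures; pointers and size_t
 * are modelled as 64-bit integers (assumes an LP64 target). */
struct kvec_model {
	uint64_t iov_base;		/* void * in the kernel */
	uint64_t iov_len;		/* size_t in the kernel */
};

struct msghdr_model {
	uint64_t msg_name;		/* void * */
	int32_t  msg_namelen;		/* int, then 4 bytes of padding */
	uint64_t msg_iov;		/* struct iovec * */
	uint64_t msg_iovlen;
	uint64_t msg_control;
	uint64_t msg_controllen;
	uint32_t msg_flags;		/* then 4 bytes of tail padding */
};

/* Assumed layout of the locals, as annotated in the dump. */
struct frame_model {
	struct kvec_model iov;
	struct msghdr_model msg;
};

/* Address of the first dumped word (iov.iov_base), from the dump. */
#define FRAME_BASE 0xffff8800022f3d90ULL

/* Linux values of the two receive flags. */
#define MODEL_MSG_WAITALL  0x100u
#define MODEL_MSG_NOSIGNAL 0x4000u

/* The layout only matches the dump if the frame is 72 bytes long. */
_Static_assert(sizeof(struct frame_model) == 72, "unexpected padding");

/* Address a field would have if the frame started at FRAME_BASE. */
static uint64_t field_addr(size_t offset)
{
	return FRAME_BASE + offset;
}

/* Non-zero values copied from the stack dump. */
static struct frame_model dumped_frame(void)
{
	struct frame_model f = {
		.iov = { .iov_base = 0xffff8800022f3d88ULL, .iov_len = 0x10 },
		.msg = {
			.msg_iov = 0xffff8800022f3d90ULL,
			.msg_iovlen = 1,
			.msg_flags = 0x4100,
		},
	};
	return f;
}

/* 1 if msg.msg_iov holds the address of the iov local. */
static int msg_iov_points_at_iov(const struct frame_model *f)
{
	return f->msg.msg_iov == field_addr(offsetof(struct frame_model, iov));
}

/* Signed distance from the iov local to where iov_base points. */
static int64_t iov_base_delta(const struct frame_model *f)
{
	return (int64_t)(f->iov.iov_base - FRAME_BASE);
}

/* 1 if msg_flags is exactly MSG_WAITALL | MSG_NOSIGNAL. */
static int flags_are_waitall_nosignal(const struct frame_model *f)
{
	return f->msg.msg_flags == (MODEL_MSG_WAITALL | MODEL_MSG_NOSIGNAL);
}
```

Under these assumptions msg.msg_flags lands at ffff8800022f3dd0 and the
word after the frame at ffff8800022f3dd8, matching the annotated
return address slot.  msg_iov_points_at_iov() holds for the dumped
values, while iov_base_delta() comes out as -8, i.e. iov_base points
just below the iov itself and not at a separate receive buffer.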

Can you reproduce this with Linux 4.1 (now in unstable)?

Can you reproduce this on bare hardware (without Xen)?

Ben.

-- 
Ben Hutchings
If God had intended Man to program,
we'd have been born with serial I/O ports.
