
Bug#801925: marked as done (NULL pointer dereference: IP: [<f828a00c>] sr_runtime_suspend+0xc/0x20 [sr_mod])



Your message dated Fri, 12 Feb 2016 23:31:31 +0000
with message-id <1455319891.2801.67.camel@decadent.org.uk>
and subject line Re: NULL pointer dereference: IP: [<f828a00c>] sr_runtime_suspend+0xc/0x20 [sr_mod]
has caused the Debian Bug report #801925,
regarding NULL pointer dereference: IP: [<f828a00c>] sr_runtime_suspend+0xc/0x20 [sr_mod]
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
801925: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=801925
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: linux-image-4.2.0-1-686-pae
Version: 4.2.3-2
Severity: important


Dear Linux SCSI folks,


Please don’t include the address submit@bugs.debian.org in your reply.


On Friday, 16.10.2015, at 03:05 +0200, Paul Menzel wrote:

> I am using Debian Sid/unstable with Linux 4.2.3-1. After upgrading
> systemd from 227-1 to 227-2 [1], along with other packages, the system
> no longer starts up: the /dev/md1 device doesn’t seem to be found, and
> I am dropped into a shell from the initramfs (BusyBox).
> 
> With only wireless LAN and no serial or USB debug capabilities, and
> since mounting a USB storage device did not work, I manually copied the
> beginning of the Oops.
> 
> ```
> BUG: unable to handle kernel NULL pointer dereference at 00000014
> IP: [<f828a00c>] sr_runtime_suspend+0xc/0x20 [sr_mod]
> *pdpt = 000000003696e001 *pde = 0000000000000000
> Oops: 0000 [#1] SMP
> Modules linked in: sd_mod(+) sr_mod(+) cdrom ata_generic ohci_pci ahci libahci pata_amd firewire_ohci firewire_core crc_itu_t forcedeth libata scsi_mod ohci_hcd ehci_pci ehci_hcd usbcore usb_common fan thermal thermal_sys floppy(+)
> CPU: 1 PID: 73 Comm: systemd-udevd Not tainted 4.2.0-1-686-pae #1 Debian 4.2.3-1
> Hardware name: Packard Bell imedia S3210/WMCP78M, BIOS P01-B2 11/06/2009
> task: f68dd040 ti: f6988000 task.ti: f6988000
> EIP: 0060:[<f828a00c>] EFLAGS: 00010246 CPU: 1
> EIP is at sr_runtime_suspend+0xc/0x20 [sr_mod]
> EAX: 00000000 EBX: f6a30cd8 ECX: f6c03d2c EDX: 00000000
> ESI: 00000000 EDI: f828e100 EBP: f6989ba8 ESP: f6989b88
>  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> CR0: 8005003b CR2: 00000014 CR3: 3696d780 CR4: 000006f0
> Stack:
>  f83346c3 00000000 00000001 fffffff5 f6a7d150 f6a30cd8 f6a30d3c 00000000
>  f6989bbc c1390cb7 f6a30cd8 f8334660 00000000 f6989bd0 c1390d0f f6a30cd8
>  f8334660 00000000 f6989c0c c13916cb f694a614 f68dd040 00000000 00000008
> Call Trace:
>  […] ? scsi_runtime_suspend+0x63/0xa0 [scsi_mod]
>  […] ? __rpm_callback+0x27/0x60
> […]
> ```
> 
> I tried also to boot with Linux 4.1 and it fails the same way.
> 
> Is that a known problem, and has it been fixed in the meantime? It’d be
> great if you could help me get the system to boot again. Please tell me
> if you need more information to debug this issue and I’ll do my best to
> get it.

Ben Hutchings asked me to test the patch below to get more debug
information.

```
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 8bd54a6..dd5b5b2 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -144,6 +144,12 @@ static int sr_runtime_suspend(struct device *dev)
 {
 	struct scsi_cd *cd = dev_get_drvdata(dev);
 
+	if (WARN_ON(!cd)) {
+		pr_info("%s: cd == NULL; power.usage_count = %d\n",
+			__func__, atomic_read(&dev->power.usage_count));
+		return 0;
+	}
+
 	if (cd->media_present)
 		return -EBUSY;
 	else
@@ -652,7 +658,13 @@ static int sr_probe(struct device *dev)
 	struct scsi_cd *cd;
 	int minor, error;
 
-	scsi_autopm_get_device(sdev);
+	error = scsi_autopm_get_device(sdev);
+	if (error) {
+		pr_err("%s: scsi_autopm_get_device returned %d\n",
+		       __func__, error);
+		return error;
+	}
+
 	error = -ENODEV;
 	if (sdev->type != TYPE_ROM && sdev->type != TYPE_WORM)
 		goto fail;
@@ -719,6 +731,9 @@ static int sr_probe(struct device *dev)
 	if (register_cdrom(&cd->cdi))
 		goto fail_put;
 
+	pr_info("%s: power.usage_count = %d\n",
+		__func__, atomic_read(&dev->power.usage_count));
+
 	/*
 	 * Initialize block layer runtime PM stuffs before the
 	 * periodic event checking request gets started in add_disk.
```

I’ll try that as soon as a spare drive has arrived to which I can copy
the data as a backup.

More thoughts are welcome, especially on whether that error suggests a
failing drive.
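
For reference, here is a minimal user-space sketch of what the first hunk guards against. It rests on an assumption: the faulting address 00000014, together with EAX being 0, looks consistent with reading a field 0x14 bytes into a NULL struct pointer. The names `fake_cd`, `fault_address_if_null` and `fake_runtime_suspend` are hypothetical, the struct layout is made up and is not the real `struct scsi_cd`, and the `return 0` in the no-media case is assumed because the hunk's context cuts off after `else`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the driver's per-device data. This is NOT the
 * real struct scsi_cd layout; the leading array is sized only so that
 * media_present lands at offset 0x14, matching the faulting address.
 */
struct fake_cd {
	uint32_t other_fields[5];	/* 5 * 4 = 20 = 0x14 bytes */
	uint32_t media_present;
};

static_assert(offsetof(struct fake_cd, media_present) == 0x14,
	      "media_present must sit at offset 0x14 in this sketch");

/* Address a load of cd->media_present would touch if cd were NULL. */
static uintptr_t fault_address_if_null(void)
{
	return (uintptr_t)offsetof(struct fake_cd, media_present);
}

/*
 * Same shape as the patched callback: report and return early when no
 * driver data is attached, instead of dereferencing the NULL pointer.
 * Returning 0 when no medium is present is an assumption of this sketch.
 */
static int fake_runtime_suspend(const struct fake_cd *cd)
{
	if (cd == NULL) {
		fprintf(stderr, "%s: cd == NULL\n", __func__);
		return 0;
	}

	return cd->media_present ? -EBUSY : 0;
}
```

Calling `fake_runtime_suspend(NULL)` takes the early-return path, while a `struct fake_cd` with `media_present` set yields `-EBUSY`. This only illustrates the guard; it says nothing about why the driver data would be NULL in the first place, which is what the debug output is meant to find out.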


Thanks,

Paul


> [1] http://metadata.ftp-master.debian.org/changelogs//main/s/systemd/systemd_227-2_changelog
-- 
GPG key: 33623E9B
Fingerprint = 0EB1 649D 4361 D04F 3C70  6F71 4DD7 BF75 3362 3E9B

Giant Monkey Software Engineering GmbH

Brunnenstr. 7D
10119 Berlin Mitte

Managing directors: Adrian Fuhrmann, Lion Vollnhals and Paul Menzel

VAT ID: DE281524720
HRB 139495 B Amtsgericht Charlottenburg



--- End Message ---
--- Begin Message ---
Version: 4.4.1-1~exp1

On Tue, 2016-02-09 at 20:51 +0000, Ben Hutchings wrote:
> On Tue, 2016-02-09 at 20:56 +0100, Alexandre Rossi wrote:
> > Hi,
> > 
> > netconsole does not seem to work so early in the boot process this time.
> > 
> > > As this is Linux 4.3 and not 4.4, I guess this is a different problem
> > > though. Alexandre, were you able to capture the stack trace? I’d submit
> > > a new bug report with this.
> > 
> > Here is a photo. Please ping me if you need to test some debugging patches.
> 
> I'm pretty sure this crash is fixed by commit 4fd41a8552af ("SCSI: Fix NULL
> pointer dereference in runtime PM"), which I've now queued up for 4.3
> (though it's already in 4.4 which I'll probably upload to unstable soon).

The version above includes both fixes.

Ben.

-- 
Ben Hutchings
I say we take off; nuke the site from orbit.  It's the only way to be sure.



--- End Message ---
