rumpdisk device timeouts
I've been using the 32-bit Hurd for my stress-ng testing, so as to stay
isolated from developments in the 64-bit version. The virtual machines
use the 'QEMU HARDDISK' device, and I have repeatedly hit read and write
timeouts when running with rumpdisk, but have seen no evidence of any
similar failures when using the Linux block driver without rumpdisk.
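For reference, the disk-heavy runs I mean are along these lines (an
illustrative invocation only; the exact stressor mix and duration vary
between runs):

   stress-ng --hdd 4 --hdd-bytes 1g --io 2 --timeout 90m --verbose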
The timeouts look like this:
[ 2011.9300050] wd0d: device timeout reading fsbn 10184704 of 10184704-10184711 (wd0 bn 10184704; cn 10103 tn 13 sn 61), xfer d20, retry 0
[ 2011.9300050] wd0d: device timeout reading fsbn 449552 of 449552-449559 (wd0 bn 449552; cn 445 tn 15 sn 47), xfer ec0, retry 0
[ 2011.9300050] wd0d: device timeout reading fsbn 440176 of 440176-440183 (wd0 bn 440176; cn 436 tn 10 sn 58), xfer b80, retry 0
[ 2011.9300050] wd0d: device timeout reading fsbn 1502104 of 1502104-1502111 (wd0 bn 1502104; cn 1490 tn 2 sn 58), xfer f90, retry 0
[ 10508.3700050] wd0d: device timeout writing fsbn 176616 of 176616-176623 (wd0 bn 176616; cn 175 tn 3 sn 27), xfer d88, retry 0
[ 10508.3700050] wd0d: device timeout writing fsbn 176624 of 176624-176631 (wd0 bn 176624; cn 175 tn 3 sn 35), xfer f90, retry 0
[ 10518.8700050] wd0d: device timeout writing fsbn 176624 of 176624-176631 (wd0 bn 176624; cn 175 tn 3 sn 35), xfer f90, retry 1
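For what it's worth, the cn/tn/sn fields in those messages decode back
to the block numbers if you assume the usual 16-head, 63-sector
translated geometry, so they at least look internally consistent; a
quick sanity check (assuming that geometry):

   #!/usr/bin/env python3
   # Decode the wd(4)-style cn/tn/sn fields back to a block number,
   # assuming a 16-head, 63-sectors-per-track translated geometry.
   HEADS, SECTORS = 16, 63

   def chs_to_bn(cn, tn, sn):
       return (cn * HEADS + tn) * SECTORS + sn

   assert chs_to_bn(10103, 13, 61) == 10184704
   assert chs_to_bn(445, 15, 47) == 449552
   assert chs_to_bn(175, 3, 27) == 176616

The affected blocks are also scattered across the disk rather than
clustered, so it doesn't look like a single bad region of the image.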
I've also had a number of occasions where the rumpdisk task was
seemingly the central figure in a system-wide lockup, with the kernel
debugger's user-space stack trace showing, for example:
thread: 32
Continuation mach_msg_continue
>>>>> user space <<<<<
mach_msg_trap 0x822daec(0x81b2360(7b518b4,2,0,18,ae)
__pthread_block 0x81e0369(7c04320,20118840,7b518f8,8120e02,0)
__pthread_cond_timedwait_internal 0x81e0ce4(7c04260,7c04b40,ffffffff,0,81e0e09)
pthread_cond_wait 0x81e0e21(7c04260,7c04b40,7c04b40,812c2bd,0)
rumpuser_cv_wait 0x812cadd(7c04260,7c04b40,7b519e8,811bd11,7c04b40)
0x811bd7c(200aefa0,200aef9c,1000,1,0)
rumpns_physio 0x818a37c(806f110,0,303,0,0)
0x8070512(303,0,7b51c64,10,1)
rumpns_cdev_write 0x80d666b(303,0,7b51c64,10,812c2a9)
rumpns_spec_write 0x8160ccb(7b51bac,8354d74,819c98b,8354d74,0)
0x819b5ba(20092000,7b51c64,10,20048040,0)
0x816d399(2011e0c0,7b51cd8,7b51c64,20048040,0)
rumpns_dofilewrite 0x8096a44(3,2011e0c0,71fb000,1000,7b51cd8)
0x817d105(20118840,7b51d58,7b51d50,8124fbe,0)
0x812500e(ae,7b51d58,18,7b51d50,0)
0x8117dac(3,71fb000,1000,20b7b000,0)
rumpdisk_device_write 0x804a413(20013f80,b2,12,0,105bd8)
_Xdevice_write 0x804d371(7b51ee0,7b53ef0,819febb,cb87f3c,7b53ef0)
0x804ab9b(7b51ee0,7b53ef0,7b51e94,0,7b53ef0)
0x804e21b(7b51ee0,7b53ef0,0,0,80001712)
0x81b26da(7b55f98,2000,10,900,1d4c0)
0x804e34b(0,8354d74,7b55fe8,819e545,0)
0x819e589(7c04320,cb87f78,0,0,
The above might not be abnormal in itself; it appears to be a
device_write RPC blocked in rumpns_physio, waiting on a rump condition
variable, and I haven't looked through the code yet. Nevertheless the
task was stalled, and the cause was not page-in, which is often what
stalls other tasks during this test case (page wiring does seem to be
functioning correctly). I have a virtual machine snapshot with rumpdisk
in the above state if more information would be helpful.
With an additional improvement to libports interruptions (which I'll
mail about separately), I switched off the non-rumpdisk Hurd test case
after 20 successful hours, whereas with rumpdisk I can only sometimes
manage 90 minutes at most.
Is a 64-bit rumpdisk virtual machine likely to be any more stable than
the 32-bit one?
There's no indication that my host machine has any hardware issues, but
I suppose there could be a bug in QEMU's q35/SATA emulation rather than
in the i440FX/IDE setup used by the non-rumpdisk guest; the difference
in disk attachment is sketched below. I did try a 64-bit Hurd install
on an old physical PC, but without any success so far. My second PC has
UEFI-only firmware, so I don't believe that will get very far.
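For concreteness, the two attachments compare roughly like this
(illustrative command lines; the image name, memory size and remaining
options are placeholders rather than my exact invocations):

   # non-rumpdisk guest: i440FX ('pc') machine, disk on the PIIX IDE controller
   qemu-system-x86_64 -M pc -m 2G \
       -drive id=hd0,file=hurd32.img,format=raw,if=none \
       -device ide-hd,drive=hd0,bus=ide.0

   # rumpdisk guest: q35 machine, disk on the ICH9 AHCI (SATA) controller,
   # whose ports also show up as ide.0 ... ide.5
   qemu-system-x86_64 -M q35 -m 2G \
       -drive id=hd0,file=hurd32.img,format=raw,if=none \
       -device ide-hd,drive=hd0,bus=ide.0

In both cases the guest reports the disk model as 'QEMU HARDDISK'.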