Re: Bug#1093200: Some packages consistently FTBFS with EFAULT (Bad address) on most mips64el buildds
Hi!
On Tue, Feb 4, 2025 at 11:59 AM Sergei Golovan <sgolovan@debian.org> wrote:
>
> Hi!
>
> On Wed, Jan 29, 2025 at 10:45 PM Uwe Kleine-König <ukleinek@debian.org> wrote:
> >
> > Hello,
> >
> > I agree this looks like the kernel is at least involved in this problem.
> > Is someone able to do a bisect to help the kernel team to pinpoint the
> > issue?
> >
> > Or does someone have a reproducer that also works for someone without
> > mips64el hardware (probably something involving qemu)?
>
> I've tried to reproduce the Erlang FTBFS in qemu (on both the malta and
> loongson3-virt machines) and failed to do so.
> Also, I've found a thread on the linux-kernel mailing list which might be
> relevant (see [1]). It describes EFAULT errors on mips64 that were
> introduced at roughly the same time as Erlang started to FTBFS. I've run
> the test they suggested (in qemu on the loongson3-virt machine).
I've tried reverting commit 4bce37a68ff884e821a02a731897a8119e0c37b7,
mentioned in [1], and adapted the reverted code to the changed prototype
of expand_stack() (commit [2] provided a few hints). After that, the test
program from [1] started working on kernel 6.1.123 in qemu (on the
loongson3-virt machine).
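To illustrate the adaptation, here is a simplified userspace model of the control flow (not kernel code). As I understand [2], the new expand_stack(mm, address) is called with the mmap lock held for reading and returns NULL with the lock already dropped on failure, which is why the failure path has to jump to bad_area_nosemaphore rather than bad_area. The toy structs, the stack_limit field and the *_model helpers below are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types (hypothetical, illustration only). */
struct vma {
	unsigned long vm_start, vm_end;
	bool grows_down;		/* models VM_GROWSDOWN */
};

struct mm {
	struct vma *stack;		/* the only mapping in this model */
	bool read_locked;		/* models the mmap read lock */
	unsigned long stack_limit;	/* lowest address the stack may reach */
};

enum fault_result { FAULT_OK, FAULT_BAD_AREA };

/* Models find_vma(): first vma whose vm_end lies above addr, or NULL. */
static struct vma *find_vma_model(struct mm *mm, unsigned long addr)
{
	struct vma *vma = mm->stack;

	return (vma && addr < vma->vm_end) ? vma : NULL;
}

/*
 * Models the new-style expand_stack(mm, addr): must be called with the
 * lock held; on success returns the vma with the lock still held, on
 * failure returns NULL and has already dropped the lock.
 */
static struct vma *expand_stack_model(struct mm *mm, unsigned long addr)
{
	struct vma *vma = mm->stack;

	assert(mm->read_locked);
	if (!vma || !vma->grows_down || addr < mm->stack_limit) {
		mm->read_locked = false;	/* failure: lock released here */
		return NULL;
	}
	if (addr < vma->vm_start)
		vma->vm_start = addr;
	return vma;
}

/* Same shape as the patched part of the fault handler. */
static enum fault_result handle_fault(struct mm *mm, unsigned long addr)
{
	struct vma *vma;

	mm->read_locked = true;			/* mmap_read_lock(mm) */
	vma = find_vma_model(mm, addr);
	if (!vma)
		goto bad_area;
	if (vma->vm_start <= addr)
		goto good_area;
	if (!vma->grows_down)
		goto bad_area;
	vma = expand_stack_model(mm, addr);
	if (!vma)
		goto bad_area_nosemaphore;	/* lock is already gone */
good_area:
	mm->read_locked = false;		/* unlock after handling */
	return FAULT_OK;
bad_area:
	mm->read_locked = false;		/* mmap_read_unlock(mm) */
bad_area_nosemaphore:
	return FAULT_BAD_AREA;
}
```

Calling handle_fault() with an address just below the stack grows the toy vma; an address below stack_limit takes the bad_area_nosemaphore path without a second unlock.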
I can't say I know what I'm doing, so could someone review the attached
patch, try it on real hardware, and check whether the bug from [1] is
indeed connected to #1093200 (and also to #1093859, #1087809, #1086028)?
Some Erlang-based packages (e.g. wings3d) have already been removed from
testing because Erlang FTBFS on mips64el, which really bothers me.
[1] https://lore.kernel.org/all/mvmplxraqmd.fsf@suse.de/T/
[2] https://github.com/torvalds/linux/commit/8d7071af890768438c14db6172cc8f9f4d04e184
Cheers!
--
Sergei Golovan
From: Sergei Golovan <sgolovan@debian.org>
Date: Wed, 05 Feb 2025 15:47:06 +0300
Subject: [PATCH] mips/mm: Revert converting to using lock_mm_and_find_vma()
This patch reverts 4bce37a68ff884e821a02a731897a8119e0c37b7 and
adapts the code to the changed expand_stack() prototype, following
the examples in 8d7071af890768438c14db6172cc8f9f4d04e184.
Hopefully, this fixes #1093200, #1093859, #1087809, and #1086028.
Bug: https://lore.kernel.org/all/mvmplxraqmd.fsf@suse.de/T/
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -97,7 +97,6 @@
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN if 64BIT || !SMP
 	select IRQ_FORCED_THREADING
 	select ISA if EISA
-	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_REL if MODULES
 	select MODULES_USE_ELF_RELA if MODULES && 64BIT
 	select PERF_USE_VMALLOC
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -100,13 +100,22 @@
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
-	vma = lock_mm_and_find_vma(mm, address, regs);
+	mmap_read_lock(mm);
+	vma = find_vma(mm, address);
+	if (!vma)
+		goto bad_area;
+	if (vma->vm_start <= address)
+		goto good_area;
+	if (!(vma->vm_flags & VM_GROWSDOWN))
+		goto bad_area;
+	vma = expand_stack(mm, address);
 	if (!vma)
 		goto bad_area_nosemaphore;
 /*
  * Ok, we have a good vm_area for this memory access, so
  * we can handle it..
  */
+good_area:
 	si_code = SEGV_ACCERR;
 	if (write) {