Bug#924891: glibc: misc/tst-pkey fails due to cleared PKRU register after signal in amd64 32-bit compat mode
- To: 924891@bugs.debian.org
- Subject: Bug#924891: glibc: misc/tst-pkey fails due to cleared PKRU register after signal in amd64 32-bit compat mode
- From: Aurelien Jarno <aurelien@aurel32.net>
- Date: Sun, 2 Aug 2020 14:14:38 +0200
- Message-id: <[🔎] 20200802121438.GA3724539@aurel32.net>
- Reply-to: Aurelien Jarno <aurelien@aurel32.net>, 924891@bugs.debian.org
- In-reply-to: <20190328210558.GB5515@aurel32.net>
- References: <20190317175722.qgsm6uvonexs35ot@xanadu.blop.info> <87h8buna6l.fsf@mid.deneb.enyo.de> <20190326221047.GA5515@aurel32.net> <20190327073750.pf54ni4wjkd4zu4r@xanadu.blop.info> <87sgv8oj02.fsf@mid.deneb.enyo.de> <20190327210034.gn4svip5e5zqdaui@xanadu.blop.info> <20190317175722.qgsm6uvonexs35ot@xanadu.blop.info> <871s2rq5yd.fsf@mid.deneb.enyo.de> <20190317175722.qgsm6uvonexs35ot@xanadu.blop.info> <20190328210558.GB5515@aurel32.net> <20190317175722.qgsm6uvonexs35ot@xanadu.blop.info>
control: reassign -1 src:linux
control: retitle -1 linux: cleared PKRU register after signal in amd64 32-bit compat mode
control: close -1 4.10-1~exp1
control: affects -1 glibc
Hi,
On 2019-03-28 22:05, Aurelien Jarno wrote:
> Hi Florian,
>
> On 2019-03-27 23:59, Florian Weimer wrote:
> > retitle 924891 glibc: misc/tst-pkey fails due to cleared PKRU register after signal in amd64 32-bit compat mode
> > thanks
> >
> > * Lucas Nussbaum:
> >
> > > On 27/03/19 at 08:48 +0100, Florian Weimer wrote:
> > >> > If that's useful, I can easily provide access to an AWS VM to debug this
> > >> > issue.
> > >>
> > >> Oh, that would be quite helpful indeed.
> > >
> > > Can you send your SSH key? (I thought there was a way to get the SSH key
> > > for a DD, but I cannot find it anymore)
> > >
> > > Then you will be able to ssh to root@18.184.55.40.
> > > There's sbuild and schroot setup on the VM.
> > >
> > > When you are done, please 'poweroff' the machine, which will terminate
> > > it.
> >
> > The issue reproduces outside the chroot, with the stretch userland.
> >
> > What happens is that once we get out of the SIGUSR1 signal handler,
> > the PKRU register has value zero. This happens around this code in
> > the test:
> >
> > /* Check that in a signal handler, there is no access. */
> > xsignal (SIGUSR1, &sigusr1_handler);
> > xraise (SIGUSR1);
> > xsignal (SIGUSR1, SIG_DFL);
> > TEST_COMPARE (sigusr1_handler_ran, 1);
> >
> > I checked the following (via a breakpoint in pkey_get; I don't think
> > GDB can read the PKRU register directly): Inside the SIGUSR1 signal
> > handler, PKRU has value 0x55555554, as expected for this kernel, but
> > after the return, we get zero. This is the first time a signal is
> > delivered on the main thread, so it's consistent with fairly broken
> > signal handling as far as the PKRU register is concerned. I guess
> > clearing PKRU in this way might even constitute a minor security bug
> > (because the zero value means no restrictions).
>
> Thanks a lot for investigating and for all the details.
>
> > This commit looks highly relevant:
> >
> > commit a4455082dc6f0b5d51a23523f77600e8ede47c79
> > Author: Dave Hansen <dave.hansen@linux.intel.com>
> > Date: Wed Jun 8 10:25:33 2016 -0700
> >
> > x86/signals: Add missing signal_compat code for x86 features
> >
> > The 32-bit siginfo is a different binary format than the 64-bit
> > one. So, when running 32-bit binaries on 64-bit kernels, we have
> > to convert the kernel's 64-bit version to a 32-bit version that
> > userspace can grok.
> >
> > If the siginfo_t layout is incorrect (with regards to what the
> > hardware writes), I expect that we might end up copying back the wrong
> > PKRU value.
>
> This commit is actually already in the 4.9 kernel.
>
> > I'm not sure what to do here. This really looks like a kernel bug.
> > Maybe we should just verify that this is fixed in the buster kernel
> > and move on?
>
> I agree. I have been able to find a machine where I can temporarily run
> a VM. I have found that the problem has been solved between kernel
> 4.10-rc6 and 4.10, more precisely between the following debian packages:
> - linux-image-4.10.0-rc6-amd64-unsigned version 4.10~rc6-1~exp1
> - linux-image-4.10.0-trunk-amd64-unsigned version 4.10-1~exp1
>
> I gave a quick look at the commit logs, and I haven't identified a
> commit. I'll look again and try to identify the commit fixing the issue
> so that it can be backported in the stretch kernel. I'll then reassign
> the bug there.
I have identified that the issue has been fixed by the following commit
that went into kernel 4.10-rc7:
commit dffba9a31c7769be3231c420d4b364c92ba3f1ac
Author: Yu-cheng Yu <yu-cheng.yu@intel.com>
Date: Mon Jan 23 14:54:44 2017 -0800
x86/fpu/xstate: Fix xcomp_bv in XSAVES header
The compacted-format XSAVES area is determined at boot time and
never changed after. The field xsave.header.xcomp_bv indicates
which components are in the fixed XSAVES format.
In fpstate_init() we did not set xcomp_bv to reflect the XSAVES
format since at the time there is no valid data.
However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(),
as in commit:
b22cbe404a9c x86/fpu: Fix invalid FPU ptrace state after execve()
and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode
app, a #GP occurs. This can be easily triggered by doing valgrind on
a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and
others:
https://bugzilla.kernel.org/show_bug.cgi?id=190061
Fix it by setting xcomp_bv correctly.
This patch also moves the xcomp_bv initialization to the proper
place, which was in copyin_to_xsaves() as of:
4c833368f0bf x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
which fixed the bug too, but it's more efficient and cleaner to
initialize things once per boot, not for every signal handling
operation.
Reported-by: Kevin Hao <haokexin@gmail.com>
Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: haokexin@gmail.com
Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com
[ Combined it with 4c833368f0bf. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This actually makes sense given that on x86 the protection keys register is
saved and restored using XSAVE.
I am therefore reassigning the bug to the kernel, and closing it given it has
been fixed a lot of time ago.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
Reply to: