
Re: Bug#866122: slapd-mtread crash on ppc64{,el} in stretch/sid



Hi Ryan,

On 07/11/2017 02:58 AM, Ryan Tandy wrote:
> Today I built Linux 4.12 from upstream source and the test program still
> crashes. I was looking at your fixes to initialize load_{fp,tm,vec} as well
> as someone else fixing the CONFIG_ALIVEC typo but none of those have helped.

Right, I tested it with the pending patches for HTM and the bug is still
there, so I doubt it has been fixed already.

> I did confirm on this kernel that reverting 613036d9 still stops it from
> crashing. Tomorrow I will try to narrow it down to a specific change. There
> are only 4 hunks after all (the addition of msr_tm_active cannot be reverted
> as there are more calls to it now).

In fact I just did that, and I found that the following patch fixes the
problem. I do not understand why yet; I am working on it right now.

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9f3e2c932dcc..21bcb3b19758 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -231,7 +231,7 @@ void enable_kernel_fp(void)
 EXPORT_SYMBOL(enable_kernel_fp);
 
 static int restore_fp(struct task_struct *tsk) {
-       if (tsk->thread.load_fp || msr_tm_active(tsk->thread.regs->msr)) {
+       if (tsk->thread.load_fp) {
                load_fp_state(&current->thread.fp_state);
                current->thread.load_fp++;
                return 1;
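
To spell out what this one-line change does and does not affect, here is a
small userspace sketch of the two conditions. It is only a model, not kernel
code: the MSR_TS_* bit positions are what I read in
arch/powerpc/include/asm/reg.h (please double check them), and the names
tm_active(), restore_fp_before(), restore_fp_after() and print_differences()
are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed MSR transaction-state bits (from my reading of asm/reg.h). */
#define MSR_TS_S    (1ULL << 33)            /* transaction suspended */
#define MSR_TS_T    (1ULL << 34)            /* transactional */
#define MSR_TS_MASK (MSR_TS_S | MSR_TS_T)

/* Userspace stand-in for msr_tm_active(): any TS bit set. */
static bool tm_active(uint64_t msr)
{
    return (msr & MSR_TS_MASK) != 0;
}

/* Condition in restore_fp() before the patch above. */
static bool restore_fp_before(unsigned load_fp, uint64_t msr)
{
    return load_fp || tm_active(msr);
}

/* Condition in restore_fp() after the patch above: the MSR is ignored. */
static bool restore_fp_after(unsigned load_fp, uint64_t msr)
{
    (void)msr;
    return load_fp != 0;
}

/* Print every combination and return how many of them change behaviour. */
static int print_differences(void)
{
    const uint64_t msrs[] = { 0, MSR_TS_S, MSR_TS_T };
    const char *names[] = { "non-transactional", "suspended", "transactional" };
    int differing = 0;

    assert(sizeof msrs / sizeof msrs[0] == sizeof names / sizeof names[0]);

    for (unsigned load_fp = 0; load_fp <= 1; load_fp++) {
        for (int i = 0; i < 3; i++) {
            bool before = restore_fp_before(load_fp, msrs[i]);
            bool after = restore_fp_after(load_fp, msrs[i]);

            printf("load_fp=%u %-17s before=%d after=%d%s\n",
                   load_fp, names[i], before, after,
                   before != after ? "  <-- differs" : "");
            if (before != after)
                differing++;
        }
    }
    return differing;
}
```

In this model the only rows that differ are the ones with load_fp == 0 while
a transaction is active or suspended, so that is the case I am looking at.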

> It turns out it is _not_ compiler dependent. The test program compiled in a
> jessie chroot succeeds in that chroot and then crashes if I run the same
> binary in a stretch chroot. This also means I was wrong about the m{t,f}vsrd
> instructions being related, as gcc-4.9 doesn't emit them (for this particular
> program, at least).

I understand that glibc itself might contain VSX instructions, so even if
your application does not use VSX, it might still be required depending on
the glibc version you are using. That said, the problem seems to be on the
floating point (FP) side.

> objdump -d libpthread.so.0 output apparently lists some tbegin/tend
> instructions, but I suppose those could be selected at runtime?

Correct. I checked, and Debian enables HTM[1] to do lock
elision. It is not an option that you can change at runtime; we would
need to reconfigure and recompile glibc if we want to disable it. There is
currently an effort to use glibc tunables to enable/disable lock
elision at runtime, but this is still under development.
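
As a related check, a program can at least ask whether the kernel advertises
HTM to userspace at all. The sketch below assumes the PPC_FEATURE2_HTM bit
in AT_HWCAP2 (value 0x40000000 as far as I know) and uses the made-up helper
names hwcap2_has_htm() and htm_advertised(). It says nothing about whether
glibc was built with lock elision; it only covers the hardware/kernel side.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__linux__) && defined(__powerpc__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP2 */
#endif

/* Assumed value; normally provided by the powerpc headers. */
#ifndef PPC_FEATURE2_HTM
#define PPC_FEATURE2_HTM 0x40000000UL
#endif

/* Pure helper: does this AT_HWCAP2 word have the HTM bit set? */
static bool hwcap2_has_htm(unsigned long hwcap2)
{
    return (hwcap2 & PPC_FEATURE2_HTM) != 0;
}

/* Query the running system; always false when not built for Linux/powerpc. */
static bool htm_advertised(void)
{
    /* Sanity check on the helper itself. */
    assert(hwcap2_has_htm(PPC_FEATURE2_HTM));
#if defined(__linux__) && defined(__powerpc__)
    return hwcap2_has_htm(getauxval(AT_HWCAP2));
#else
    return false;
#endif
}
```

Calling htm_advertised() on the affected ppc64el machine should tell whether
the tbegin/tend paths can be taken there at all.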

Out of curiosity, how did you bisect the kernel to find that commit-id?
Did you do it automatically?

[1] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=ppc64el&ver=2.24-12&stamp=1497900384&raw=0 (Check for --enable-lock-elision)

