[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: rx2660 + debian



On Tue, 26 Apr 2022 08:18:25 +0000
Pedro Miguel Justo <pmsjt@texair.net> wrote:

> > On 2022/Apr/26, at 01:01, Sergei Trofimovich <slyich@gmail.com> wrote:
> > 
> > On Tue, 26 Apr 2022 02:43:00 +0000
> > Pedro Miguel Justo <pmsjt@texair.net> wrote:
> >   
> >>> On 2022/Apr/25, at 14:27, Pedro Miguel Justo <pmsjt@texair.net> wrote:
> >>> 
> >>> 
> >>>   
> >>>> On 2022/Apr/25, at 14:09, Sergei Trofimovich <slyich@gmail.com> wrote:
> >>>> 
> >>>> On Mon, 25 Apr 2022 15:07:58 +0000
> >>>> Pedro Miguel Justo <pmsjt@texair.net> wrote:
> >>>>   
> >>>>>> On 2022/Apr/25, at 01:22, Pedro Miguel Justo <pmsjt@texair.net> wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>   
> >>>>>>> On 2022/Apr/25, at 01:14, Frank Scheiner <frank.scheiner@web.de> wrote:
> >>>>>>> 
> >>>>>>> Hi guys,
> >>>>>>> 
> >>>>>>> On 25.04.22 10:09, John Paul Adrian Glaubitz wrote:      
> >>>>>>>>> From what I can understand by the information in the bugcheck, this is somewhat related to a violation
> >>>>>>>>> in parameter copy from user to kernel during some boot-time, crypto, self-test. Does that sound right?
> >>>>>>>>> If that is the case, how would this be related to FW?      
> >>>>>>>> 
> >>>>>>>> I'm not claiming that it must be related to the firmware, I'm just saying that I don't see this problem
> >>>>>>>> on my RX2660 at all and I have even reinstalled it recently with one of the latest firmware images
> >>>>>>>> without having to pass any parameter to the command line.      
> >>>>>>> 
> >>>>>>> A difference between Adrian's rx2660 and Pedro's rx2660 is Montecito
> >>>>>>> left and Montvale right.
> >>>>>>> 
> >>>>>>> But could still be multiple other reasons we haven't looked at yet in
> >>>>>>> detail:
> >>>>>>> 
> >>>>>>> * amount of memory installed
> >>>>>>> * SMT enabled or not
> >>>>>>> * number of processor modules installed
> >>>>>>> 
> >>>>>>> It might be possible for me to check on my rx2660s (one with Montvale
> >>>>>>> and one with Montecito(s)) tomorrow. I will then also look at my other
> >>>>>>> Itanium gear and gather relevant information.
> >>>>>>>   
> >>>>>> 
> >>>>>> Yes, this sounds mode likely to me too.
> >>>>>> 
> >>>>>> The crypto self-tests seem to be an innocent bystander here. I tried booting the most recent kernel with the option “cryptomgr.notests” and it went much farther. Alas it still failed with another buffer copy validation for a different caller altogether:
> >>>>>> 
> >>>>>> [    3.836466]  [<a000000101353690>] usercopy_abort+0x120/0x130
> >>>>>> [    3.836466]                                 sp=e0000001000cfdf0 bsp=e0000001000c9388
> >>>>>> [    3.836466]  [<a0000001004c5660>] __check_object_size+0x3c0/0x420
> >>>>>> [    3.836466]                                 sp=e0000001000cfe00 bsp=e0000001000c9350
> >>>>>> [    3.836466]  [<a000000100570030>] sys_getcwd+0x250/0x420
> >>>>>> [    3.836466]                                 sp=e0000001000cfe00 bsp=e0000001000c92c8
> >>>>>> [    3.836466]  [<a00000010000c860>] ia64_ret_from_syscall+0x0/0x20
> >>>>>> [    3.836466]                                 sp=e0000001000cfe30 bsp=e0000001000c92c8
> >>>>>> [    3.836466]  [<a000000000040720>] ia64_ivt+0xffffffff00040720/0x400
> >>>>>> [    3.836466]                                 sp=e0000001000d0000 bsp=e0000001000c92c8
> >>>>>> 
> >>>>>> This suggests the bug might be in the logic validating these buffers against the allocations (heap, span, etc).
> >>>>>> 
> >>>>>> I don’t know why hardened_usercopy=off is not being observed by the kernel. As a work-around I am copying myself a new kernel with CONFIG_HARDENED_USERCOPY disabled at the source.     
> >> 
> >> So, I finished compiling my kernel with CONFIG_HARDENED_USERCOPY disabled. Guess what:
> >> 
> >> pmsjt@debian:~$ uname -a
> >> Linux debian 5.17.3-rt17 #2 SMP Mon Apr 25 16:55:00 PDT 2022 ia64 GNU/Linux
> >> 
> >> Yup, the system starts just fine with the most recent kernel. So, two things we can infer from this:
> >> - Yes, usercopy validation appears to be broken. The contours of how broken it is are yet unknown but we’ll have to investigate to see what part of the validation is failing.  
> > 
> > My vague memory tells me that the breakage mechanics is the following:
> > 
> > 1. [ok] mm/usercopy.c:check_kernel_text_object() gets called on various kernel addresses
> > 2. [ok] ia64 address space has at least two widely used kernel addresses:
> >        - 0xe0000000_00000000-0xe000..ff-ffffffff - proper linear mapping, 0xe0000000_00000000 + phys
> >                                                    (so called, RGN_KERNEL = 7)
> >        - 0xa0000001_00000000-0xa0000000_00ffffff - kernel ~linear mapping, 0xa0000001_00000000 + phys
> >                                                    (so called, RGN_GATE = 5)
> > 
> > Both are easy to translate to proper linear address, but the translations
> > are not identical. Note that RGN_GATE has an offset.
> > 
> > ia64's stack for some reason lives in RGN_KERNEL space (probably due to
> > historical reasons before kernel switched to vmapped stacks on other arches).
> > You can see these addresses in backtraces as
> >    sp=e0000001000cfdf0 bsp=e0000001000c9388
> > 
> > ia64's kernel code on the other hand lives in RGN_GATE.
> > You can see these addresses in backtraces as
> >    [<a000000101353690>] usercopy_abort+0x120/0x130
> > 
> > Now, check_kernel_text_object() compares passed kernel address for 2 ranges:
> > - RGN_GATE:
> >    unsigned long textlow = (unsigned long)_stext;
> >    unsigned long texthigh = (unsigned long)_etext;
> >    if (overlaps(ptr, n, textlow, texthigh))
> >        usercopy_abort("kernel text", NULL, to_user, ptr - textlow, n);
> > 
> >    I think this check is correct. There is no address translation.
> > 
> > - RGN_KERNEL:
> >    textlow_linear = (unsigned long)lm_alias(textlow);
> >    texthigh_linear = (unsigned long)lm_alias(texthigh);
> >    if (overlaps(ptr, n, textlow_linear, texthigh_linear))
> >        usercopy_abort("linear kernel text", NULL, to_user,...
> > 
> >    I think this one is wrong. Looking at the definition of lm_alias()
> >    it can ge used only on RGN_KERNEL addresses (which looks useless,
> >    they are already in correct form) and literal symbol names:
> > 
> >      include/linux/mm.h:#define lm_alias(x)  __va(__pa_symbol(x))
> >      include/linux/mm.h:#define __pa_symbol(x)  __pa(RELOC_HIDE((unsigned long)(x), 0))
> >      arch/ia64/include/asm/page.h:#define __pa(x)            ({ia64_va _v; _v.l = (long) (x); _v.f.reg = 0; _v.l;})
> >      #define __va(x)            ({ia64_va _v; _v.l = (long) (x); _v.f.reg = -1; _v.p;})
> > 
> > I think that means __pa_symbol(x) against local variables like textlow
> > is completely broken. 
> > 
> > If lm_alias() works at all on ia64 maybe that is enough to unbreak usercoy:
> > 
> > --- a/mm/usercopy.c
> > +++ b/mm/usercopy.c
> > @@ -133,13 +133,13 @@ static inline void check_kernel_text_object(const unsigned long ptr,
> >         * __pa() is not just the reverse of __va(). This can be detected
> >         * and checked:
> >         */
> > -       textlow_linear = (unsigned long)lm_alias(textlow);
> > +       textlow_linear = (unsigned long)lm_alias(_stext);
> >        /* No different mapping: we're done. */
> >        if (textlow_linear == textlow)
> >                return;
> > 
> >        /* Check the secondary mapping... */
> > -       texthigh_linear = (unsigned long)lm_alias(texthigh);
> > +       texthigh_linear = (unsigned long)lm_alias(_etext);
> >        if (overlaps(ptr, n, textlow_linear, texthigh_linear))
> >                usercopy_abort("linear kernel text", NULL, to_user,
> >                               ptr - textlow_linear, n);
> > 
> > If not here is an ia64-specific workaround:
> > 
> > --- a/mm/usercopy.c
> > +++ b/mm/usercopy.c
> > @@ -133,13 +133,13 @@ static inline void check_kernel_text_object(const unsigned long ptr,
> >         * __pa() is not just the reverse of __va(). This can be detected
> >         * and checked:
> >         */
> > -       textlow_linear = (unsigned long)lm_alias(textlow);
> > +       textlow_linear = (unsigned long)ia64_imva(textlow);
> >        /* No different mapping: we're done. */
> >        if (textlow_linear == textlow)
> >                return;
> > 
> >        /* Check the secondary mapping... */
> > -       texthigh_linear = (unsigned long)lm_alias(texthigh);
> > +       texthigh_linear = (unsigned long)ia64_imva(texthigh);
> >        if (overlaps(ptr, n, textlow_linear, texthigh_linear))
> >                usercopy_abort("linear kernel text", NULL, to_user,
> >                               ptr - textlow_linear, n);
> > 
> > CAVEAT: I did not compile- or run- test it. Might need a bit of pointer/long casting around.
> >   
> 
> I don’t mind at all giving this a try. Sounds quite plausible.
> 
> Do you think the stacks (and RSE) will eventually have to move from RGN_KERNEL to vmapped the same as other archs to avoid issues like this and other to come?

I'd say the move to virtual mapping area would not change things much
even for this specific bug. It would move more addresses into a place
where __pa() macro does not work and thus requires proper resolution
to get linear address.

Moving to vmap on ia64 does not have a benefit of guarding against stack
overflows that other targets have: backing store clashes into memory stack
first on overflows somewhere in the middle of the stack area. On the other
hand vmap does help in getting rid of higher order allocations for stacks.

Implementing vmapped stacks is probably a bit tricky: ia64 is an architecture
where TLB cache registers (itc*/dtc*) are maintained in software (with some
hardware assistance from VHPT walker). To make things simpler ia64 sets static
TLB registers (itr*/dtr*) to kernel as is: code around IA64_TR_KERNEL and
IA64_TR_CURRENT_STACK has a few assumptions around the stack layout.

I'm not sure if TLB fault handlers would just work for vmapped stack. Maybe?

-- 

  Sergei


Reply to: