Re: rx2660 + debian
> On 2022/Apr/25, at 01:14, Frank Scheiner <firstname.lastname@example.org> wrote:
> Hi guys,
> On 25.04.22 10:09, John Paul Adrian Glaubitz wrote:
>>> From what I can understand by the information in the bugcheck, this is somewhat related to a violation
>>> in parameter copy from user to kernel during some boot-time, crypto, self-test. Does that sound right?
>>> If that is the case, how would this be related to FW?
>> I'm not claiming that it must be related to the firmware, I'm just saying that I don't see this problem
>> on my RX2660 at all and I have even reinstalled it recently with one of the latest firmware images
>> without having to pass any parameter to the command line.
> A difference between Adrian's rx2660 and Pedro's rx2660 is Montecito
> left and Montvale right.
> But could still be multiple other reasons we haven't looked at yet in
> * amount of memory installed
> * SMT enabled or not
> * number of processor modules installed
> It might be possible for me to check on my rx2660s (one with Montvale
> and one with Montecito(s)) tomorrow. I will then also look at my other
> Itanium gear and gather relevant information.
Yes, this sounds more likely to me too.
The crypto self-tests seem to be an innocent bystander here. I tried booting the most recent kernel with the option “cryptomgr.notests” and it got much further. Alas, it still failed with another buffer copy validation error, this time for a different caller altogether:
[ 3.836466] [<a000000101353690>] usercopy_abort+0x120/0x130
[ 3.836466] sp=e0000001000cfdf0 bsp=e0000001000c9388
[ 3.836466] [<a0000001004c5660>] __check_object_size+0x3c0/0x420
[ 3.836466] sp=e0000001000cfe00 bsp=e0000001000c9350
[ 3.836466] [<a000000100570030>] sys_getcwd+0x250/0x420
[ 3.836466] sp=e0000001000cfe00 bsp=e0000001000c92c8
[ 3.836466] [<a00000010000c860>] ia64_ret_from_syscall+0x0/0x20
[ 3.836466] sp=e0000001000cfe30 bsp=e0000001000c92c8
[ 3.836466] [<a000000000040720>] ia64_ivt+0xffffffff00040720/0x400
[ 3.836466] sp=e0000001000d0000 bsp=e0000001000c92c8
This suggests the bug might be in the logic that validates these buffers against the allocations (heap, span, etc.).
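In case it helps when comparing traces from different boots, here is a rough Python sketch that pulls the symbol+offset chain out of a backtrace like the one above. The function name call_chain and the regular expression are my own illustration, not an existing tool, and the sample is an excerpt of the trace above.

```python
import re

# Matches frame lines such as:
#   [<a000000100570030>] sys_getcwd+0x250/0x420
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

def call_chain(log_text):
    """Return (symbol, offset) pairs in the order they appear (innermost first)."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:  # sp=/bsp= lines carry no frame and are skipped
            frames.append((match.group(2), int(match.group(3), 16)))
    return frames

# Excerpt of the trace quoted in this mail.
SAMPLE = """\
[ 3.836466] [<a000000101353690>] usercopy_abort+0x120/0x130
[ 3.836466] sp=e0000001000cfdf0 bsp=e0000001000c9388
[ 3.836466] [<a0000001004c5660>] __check_object_size+0x3c0/0x420
[ 3.836466] [<a000000100570030>] sys_getcwd+0x250/0x420
"""

if __name__ == "__main__":
    for symbol, offset in call_chain(SAMPLE):
        print(f"{symbol}+{offset:#x}")
```

The first frame below __check_object_size is the caller whose buffer was rejected, which is sys_getcwd here rather than anything in the crypto code.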
I don’t know why hardened_usercopy=off is not being honored by the kernel. As a work-around I am building myself a new kernel with CONFIG_HARDENED_USERCOPY disabled at the source.
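To rule out the simple explanations first (the parameter never reaching the kernel, or the rebuilt config still having the option set), a small Python sketch along these lines could be used. The helper names cmdline_params and config_value are made up for illustration, the sample strings are invented, and the splitting is naive (it ignores quoted values containing spaces). On a running system the inputs would come from /proc/cmdline and, assuming the usual Debian layout, from /boot/config-<kernel release>.

```python
def cmdline_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

def config_value(config_text, option):
    """Return the value of a kernel config option, 'n' if unset, None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None

if __name__ == "__main__":
    # Invented sample inputs; replace with the contents of /proc/cmdline and
    # the kernel config file (the /boot/config-* path is an assumption).
    sample_cmdline = "root=/dev/sda2 ro hardened_usercopy=off cryptomgr.notests"
    sample_config = "CONFIG_SLAB=y\n# CONFIG_HARDENED_USERCOPY is not set\n"

    params = cmdline_params(sample_cmdline)
    print("hardened_usercopy on cmdline:", params.get("hardened_usercopy"))
    print("cryptomgr.notests present:", "cryptomgr.notests" in params)
    print("CONFIG_HARDENED_USERCOPY:",
          config_value(sample_config, "CONFIG_HARDENED_USERCOPY"))
```

If the parameter shows up in /proc/cmdline and the check still fires, then the problem is in how the kernel handles it rather than in the boot loader setup.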