
Bug#561203: threads and fork on machine with VIPT-WB cache



On 04/02/2010 09:35 PM, John David Anglin wrote:
> On Fri, 02 Apr 2010, NIIBE Yutaka wrote:
> 
>> NIIBE Yutaka wrote:
>>> To have same semantics as other archs, I think that VIPT-WB cache
>>> machine should have cache flush at ptep_set_wrprotect, so that memory
>>> of the page has up-to-date data.  Yes, it will be huge performance
>>> impact for fork.  But I don't find any good solution other than this
>>> yet.
>>
>> I think we could do something like (only for VIPT-WB cache machine):
>>
>> -	static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
>>
>> +	static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>> 	{
>> 		pte_t old_pte = *ptep;
>> +		if (atomic_read(&mm->mm_users) > 1)
>> +			flush_cache_page(vma, addr, pte_pfn(old_pte));
>> 		set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
>> 	}
> 
> I tested the hack below on two machines currently running 2.6.33.2
> UP kernels.  The change seems to fix Debian #561203 (minifail bug)!
> Thus, I definitely think you are on the right track.  I'll continue
> to test.
> 
> I suspect the same issue is present for SMP kernels.

Hi Dave,

I tested your patch today on one of my machines with a plain 2.6.33 kernel (32-bit, SMP, a B2000, I think).
Sadly, I still saw the minifail bug.

Are you sure that the patch fixed this bug for you?

Helge

do_page_fault() pid=21470 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=7986 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=19952 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=13549 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=21862 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=4615 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=17336 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=21986 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=2157 command='minifail3' type=15 address=0x000000dc
do_page_fault() pid=23886 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=2681 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3229 command='minifail3' type=15 address=0x000000ec
do_page_fault() pid=26095 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=20722 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=19912 command='minifail3' type=15 address=0x000000ec
...
pagealloc: memory corruption
7db0c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
7db0c790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
7db0c7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
7db0c7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Backtrace:
 [<1011ec14>] show_stack+0x18/0x28
 [<10117ba0>] dump_stack+0x1c/0x2c
 [<101c6594>] kernel_map_pages+0x2a0/0x2b8
 [<1019e6c8>] get_page_from_freelist+0x3d4/0x614
 [<1019ea3c>] __alloc_pages_nodemask+0x134/0x610
 [<101b1d20>] do_wp_page+0x268/0xac0
 [<101b3b34>] handle_mm_fault+0x4d4/0x7c4
 [<1011d854>] do_page_fault+0x1f8/0x2fc
 [<1011f450>] handle_interruption+0xec/0x730
 [<10103078>] intr_check_sig+0x0/0x34
...
do_page_fault() pid=13414 command='minifail3' type=15 address=0x000000dc
do_page_fault() pid=22776 command='minifail3' type=15 address=0x00000000
do_page_fault() pid=26290 command='minifail3' type=15 address=0x000000ec
do_page_fault() pid=1399 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=16130 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=26401 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3383 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3400 command='minifail3' type=15 address=0x00000004
do_page_fault() pid=18659 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3730 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=28828 command='minifail3' type=6 address=0x00000003
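
For readers without the test case at hand, the kind of workload named in the bug title (threads plus fork) can be sketched roughly as below. This is an illustration only and is not the minifail3 source: the function name run_fork_stress, the buffer size, and the check itself are assumptions made for the example. One thread keeps writing to private memory and reading it back while the main thread forks repeatedly, so the written pages are write-protected for copy-on-write over and over. On a machine with coherent fork semantics the mismatch count should stay at zero; whether this particular sketch triggers the failure on affected hardware is untested.

```c
/* Illustrative sketch only: NOT the minifail3 source. */
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define WORDS 1024

/* Private memory; becomes copy-on-write at each fork(). Volatile so the
 * read-back below is not optimized away. */
static volatile unsigned long buf[WORDS];
static atomic_int stop;
static atomic_long mismatches;

/* Writer thread: store a value everywhere, read it back, count any
 * word that does not hold the value just written. */
static void *writer(void *arg)
{
    unsigned long v = 0;
    (void)arg;
    while (!atomic_load(&stop)) {
        v++;
        for (int i = 0; i < WORDS; i++)
            buf[i] = v;
        for (int i = 0; i < WORDS; i++)
            if (buf[i] != v)
                atomic_fetch_add(&mismatches, 1);
    }
    return NULL;
}

/* Fork `forks` times while the writer runs.
 * Returns the number of mismatches seen, or -1 on setup/fork failure. */
long run_fork_stress(int forks)
{
    pthread_t t;
    long rc = 0;

    atomic_store(&stop, 0);
    atomic_store(&mismatches, 0);
    if (pthread_create(&t, NULL, writer, NULL) != 0)
        return -1;

    for (int i = 0; i < forks; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            rc = -1;
            break;
        }
        if (pid == 0)
            _exit(0);           /* child: only async-signal-safe calls */
        waitpid(pid, NULL, 0);
    }

    atomic_store(&stop, 1);
    pthread_join(t, NULL);
    return rc < 0 ? -1 : atomic_load(&mismatches);
}
```

Calling run_fork_stress(50) from a small main() and checking for a non-zero result is enough to exercise it; the child deliberately does nothing but _exit(), so any mismatch would come from the parent side of the copy-on-write handling.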


