
Re: Serious problem with ticket spinlocks on ia64



Replacing with an actually meaningful example (sorry, it is Sunday after all):

Initial:
[a1] = [a2] = 0


Core1:
st.rel	[a1] = 0x1
ld.acq	r1 = [a2]
cmp	.eq	p1 = r1, zero
p1		<exclusive region>
st.rel	[a1] = zero


Core2:
st.rel	[a2] = 0x1
ld.acq	r1 = [a1]
cmp	.eq	p2 = r1, zero
p2		<exclusive region>
st.rel	[a2] = zero



This is the typical try-acquire of an exclusive region as per Peterson's algorithm, as compiled by an ISO/ECMA CLI C compiler.
p1 (on core 1) = 1 and p2 (on core 2) = 1 is a possible combination on ia64, because each ld.acq is allowed to complete before the preceding st.rel becomes visible to the other core. That lets both cores execute the critical region concurrently.
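
To make the failing interleaving concrete, here is a minimal sketch in C that enumerates every order in which the two stores can become visible and the two loads can execute. It is a toy model, not ia64 itself: the event names W1/R1/W2/R2 and the functions both_enter() and both_can_enter() are made up for this illustration, and the "fenced" flag simply assumes that something (for example a full memory fence between the store and the load) forces each core's store to be visible before its own load.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event ids for this sketch (not part of the original post).
 * W1/W2: the st.rel of core 1/2 becomes visible to the other core.
 * R1/R2: the ld.acq of core 1/2 reads the other core's flag. */
enum { W1, R1, W2, R2, NEVENTS };

/* Replay one interleaving; return true if both cores read zero,
 * i.e. both would enter the exclusive region. */
static bool both_enter(const int order[NEVENTS])
{
    int a1 = 0, a2 = 0;    /* shared flags, initially zero */
    int r1 = -1, r2 = -1;  /* values loaded by core 1 and core 2 */

    for (int i = 0; i < NEVENTS; i++) {
        switch (order[i]) {
        case W1: a1 = 1;  break;  /* core 1: st.rel [a1] = 0x1 visible */
        case R1: r1 = a2; break;  /* core 1: ld.acq r1 = [a2] */
        case W2: a2 = 1;  break;  /* core 2: st.rel [a2] = 0x1 visible */
        case R2: r2 = a1; break;  /* core 2: ld.acq r1 = [a1] */
        default: assert(0 && "unknown event");
        }
    }
    return r1 == 0 && r2 == 0;
}

/* Search all orderings of the four events. With fenced == true, orderings
 * where a core's load runs before its own store is visible are rejected. */
static bool both_can_enter(bool fenced)
{
    for (int a = 0; a < NEVENTS; a++)
    for (int b = 0; b < NEVENTS; b++)
    for (int c = 0; c < NEVENTS; c++)
    for (int d = 0; d < NEVENTS; d++) {
        int order[NEVENTS] = { a, b, c, d };
        int pos[NEVENTS] = { -1, -1, -1, -1 };
        bool is_permutation = true;

        for (int i = 0; i < NEVENTS; i++) {
            if (pos[order[i]] != -1)
                is_permutation = false;  /* event listed twice */
            pos[order[i]] = i;
        }
        if (!is_permutation)
            continue;
        if (fenced && (pos[W1] > pos[R1] || pos[W2] > pos[R2]))
            continue;
        if (both_enter(order))
            return true;
    }
    return false;
}
```

In this model both_can_enter(false) returns true (for example the order R1, R2, W1, W2), while both_can_enter(true) returns false: with the store forced ahead of the load on each core, at least one core must observe the other's flag.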

PT


On Aug 29, 2010, at 11:45 AM, Pedro Miguel Sequeira de Justo Teixeira wrote:

> 
> 
> Acquire operations are allowed to occur before Release operations. There is nothing preventing that from happening. If this code's synchronization safety (I am not looking at the code) is based on st.rel being a full fence, then it is wrong.
> 
> 
> Initial [addr1] = [addr2] = 0x0
> 
> Core1:
> st		[addr1] = 0x5A
> st.rel	[addr2] = 0xA5
> 
> Core2:
> ld.acq	r3 = [addr2]
> ld		r4 = [addr1]
> 
> 
> It is possible to have this resulting in r3 = 0 and r4 = 0x5A.
> 
> This is why Dekker's/Peterson's doesn't work when "volatile" is simply implemented by st.rel and ld.acq.
> 
> PT
> 
> 
> On Aug 29, 2010, at 5:34 AM, Petr Tesarik wrote:
> 
>> On Saturday 28 of August 2010 00:30:11 you wrote:
>>> Sorry to barge in but... what is preventing fetchadd4.acq from reaching the
>>> value present before st4.rel?
>> 
>> First, I'm no expert on ia64 low-level detail, such as the formal 
>> specification of the memory ordering. So I don't know. ;)
>> 
>> Second, there is no st4.rel, there is only st2.rel on the upper half of the 
>> double-word. The main problem here is that a subsequent ld4.acq still sees 
>> the unincremented value.
>> 
>> Petr Tesarik
>> 
>>> On Fri, Aug 27, 2010 at 3:13 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
>>>> On Friday 27 of August 2010 23:11:55 Luck, Tony wrote:
>>>>>> One more idea. The wrap-around case is the only one when the high
>>>>>> word is modified. This is in fact the only case when the fetchadd.acq
>>>>>> competes with the st2.rel about the actual contents of that location.
>>>>>> I don't know if it matters...
>>>>> 
>>>>> I pondered that for a while - but I have difficulty believing that
>>>>> fetchadd looks at which bits changed and only writes back the bytes
>>>>> that did.
>>>> 
>>>> OTOH the counter is only 15-bit, so it also wraps around at 0xfffe7fff,
>>>> but I have never seen it fail there. It always fails after the
>>>> wrap-around from 0xfffeffff.
>>>> 
>>>> Petr Tesarik
> 

