
Re: Help on memchr() EGLIBC assembly code



On Wed, Jul 29, 2009 at 05:24:59PM -0700, Richard Henderson wrote:
> On 07/26/2009 04:45 PM, Aurelien Jarno wrote:
>> Knowing that $31 could be used for prefetch, I have modified the
>> assembly code from memchr.S to use it. It passes all the testsuite.
>>
>
> This isn't intended to be a prefetch instruction, it's
> meant to be fetching the data for the next word.  I.e.
> we've unrolled the loop and there's at least 8 bytes
> left in the search.
>
> Note the
>
>         # At least two quads remain to be accessed.
>
> comment.  At that point we're supposed to be more
> than 16 bytes away from the end of the input buffer.
>
> Actually, the confusion I see is farther upthread:
>
>> > >>>>> The problem is that the memchr() function on alpha uses prefetch,
>> > >>>>> which can cause a page boundary to be crossed, while the standards
>> > >>>>> (POSIX and C99) says it should stop when a match is found.
>
> I didn't realize this when I wrote the function.
>
> The entire function should be rewritten, since there's little
> point in using a prefetch instruction that close to the load.
> Prefetch instructions should be used to move data into the L1
> cache, not hide the 3 cycle load delay.  Thus a prefetch, if
> used, should be several cache lines ahead, not just a single word.
>
> Perhaps a better solution would be to read words until we get
> cacheline aligned, then read the entire line into 8 registers,
> prefetch the next line, then process each register one by one.
>
> Try this.
>

Thanks for this patch. I have tried it, and it does not have the original
problem I reported. Unfortunately, it does not pass the glibc testsuite.
I'll try to debug the problem later (I don't own an Alpha machine, and I
need internet access to debug it remotely).
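
For reference, here is a minimal C sketch of the behaviour discussed in
this thread: a word-at-a-time memchr() that only issues aligned 8-byte
loads lying entirely inside [s, s+n), so a load can never cross into the
next page. It is not the Alpha assembly and not the patch above. The name
sketch_memchr and the zero-byte bit trick are assumptions chosen for
illustration, and the sketch assumes all n bytes are readable, or at
least every aligned word up to and including the one holding the match.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not the glibc implementation. */
void *sketch_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    const unsigned char ch = (unsigned char)c;
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    const uint64_t pattern = ones * ch;   /* ch replicated into all 8 bytes */

    /* Head: one byte at a time until p is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)p & 7) != 0) {
        if (*p == ch)
            return (void *)p;
        p++;
        n--;
    }

    /* Body: aligned 8-byte words, only while a full word remains.
       An aligned word never straddles a page boundary. */
    while (n >= 8) {
        uint64_t w, x;
        memcpy(&w, p, sizeof w);          /* aligned load, no aliasing issue */
        x = w ^ pattern;                  /* matching bytes become zero */
        if (((x - ones) & ~x & highs) != 0)
            break;                        /* a byte in this word matches */
        p += 8;
        n -= 8;
    }

    /* Tail: locate the match inside the word, or scan the last <8 bytes. */
    while (n > 0) {
        if (*p == ch)
            return (void *)p;
        p++;
        n--;
    }
    return NULL;
}
```

The loop stops at the first word that contains a match and never loads
the following word, which is the property the quoted POSIX/C99 remark
is about. A real implementation would add the cache-line prefetch
separately, several lines ahead, as suggested in the quoted mail.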

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

