Re: Using PPC asm (from Linux kernel) in xine
Michel Lanners writes:
> On 22 May, this message from Albert D. Cahalan echoed through cyberspace:
>> =?iso-8859-1?Q?Rog writes:
> [about fast memcopy in asm]
>
>> dcbt eight,src /* prefetch the next cache line */
>> loop_top:
>> dcba eight,dst /* allocate a cache line */
>> lfd f11,8,(src)
>> lfd f12,16,(src)
>> lfd f13,24,(src)
>> lfdu f14,32,(src)
> ^^
> These should probably be 0,8,16 and 24... same goes below.
Nope. I thought so too, but then lfdu would increment
the pointer by only 24. Instead of that, the pointer
stays back by 8 bytes to compensate. The setup code
would subtract 8 before entering the loop. This in turn
means I need "eight", a register holding 8, as the index
for dcbt and dcba.
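For anyone who wants to check the offset arithmetic, here is a minimal C sketch of the same "stay back by one element" addressing. The function name copy_biased and the index variables s and d are made up for illustration and are not part of the code above; indices stand in for the biased pointers so that the C stays well defined, and the cache control instructions are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): copies `blocks` groups of
 * four doubles, addressing each group at offsets +1..+4 from an index that
 * trails the group by one element, as the asm does with byte offsets 8..32. */
void copy_biased(double *dst, const double *src, size_t blocks)
{
    assert(blocks > 0);        /* like the bdnz loop, the body runs at least once */

    ptrdiff_t s = -1;          /* setup code: source index backed up by one double */
    ptrdiff_t d = -1;          /* setup code: destination index backed up likewise */

    do {
        double a = src[s + 1];             /* lfd   f11,8(src)  */
        double b = src[s + 2];             /* lfd   f12,16(src) */
        double c = src[s + 3];             /* lfd   f13,24(src) */
        double e = src[s + 4]; s += 4;     /* lfdu  f14,32(src): load, then advance a full block */

        dst[d + 1] = a;                    /* stfd  f11,8(dst)  */
        dst[d + 2] = b;                    /* stfd  f12,16(dst) */
        dst[d + 3] = c;                    /* stfd  f13,24(dst) */
        dst[d + 4] = e; d += 4;            /* stfdu f14,32(dst): store, then advance a full block */
    } while (--blocks);                    /* bdnz loop_top */
}
```

With offsets 0, 8, 16 and 24 the update form could only move the pointer by 24 bytes per pass, which is why the whole window is shifted by one element instead.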
I think the dcbt might not be useful, since there
won't ever be a spare memory cycle for prefetch.
>> dcbi r0,src /* would like to discard the src data */
>> dcbt eight,src /* prefetch the next cache line */
>> stfd f11,8,(dst)
>> stfd f12,16,(dst)
>> stfd f13,24,(dst)
>> stfdu f14,32,(dst)
>> dcbf r0,dst /* write back if needed, then invalidate */
>
> And don't forget the loop counter and some increment...
> Or code it with a decrement operator and copy backwards?
I didn't forget; look again. This is PowerPC. :-)
Here's the assembly with C code:
dcbt eight,src // prefetch the cache line with src[1]...src[4]
loop_top: do{
dcba eight,dst // allocate a cache line for dst[1]...dst[4]
lfd f11,8,(src) double_1 = src[1];
lfd f12,16,(src) double_2 = src[2];
lfd f13,24,(src) double_3 = src[3];
lfdu f14,32,(src) double_4 = src[4]; src += 4;
dcbi r0,src // would like to discard the src[-3]...src[0]
dcbt eight,src // prefetch the cache line with src[1]...src[4]
stfd f11,8,(dst) dst[1] = double_1;
stfd f12,16,(dst) dst[2] = double_2;
stfd f13,24,(dst) dst[3] = double_3;
stfdu f14,32,(dst) dst[4] = double_4; dst += 4;
dcbf r0,dst // write back dst[-3]...dst[0] if needed, then invalidate it
bdnz loop_top }while(--ctr);
The copy should run in the opposite direction to the one
used to fill src[] with data. This way you make better use
of the cache. The cache control instructions should keep
this from mattering much, though.
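As a rough illustration of the direction point, assuming src[] was filled from low addresses to high, a copy that walks from the top down reads the most recently written elements first, and those are the ones most likely to still be cached. The name copy_descending is invented for this sketch, and it is plain C without any of the cache control instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): copies n doubles starting
 * at the highest index and working down, i.e. opposite to an ascending fill. */
void copy_descending(double *dst, const double *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    while (n > 0) {
        --n;                /* step down first so the index stays in range */
        dst[n] = src[n];    /* the last element written by the fill is copied first */
    }
}
```

The result is the same as a forward copy for non-overlapping buffers; only the order of memory accesses changes.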
>> bdnz loop_top
>>
>> BLAH, BLAH...