Re: Using PPC asm (from Linux kernel) in xine



Rog writes:

> 	According to my tests, cacheable_memcpy is approximately 40%
> 	faster than the original glibc version, which is quite an
> 	improvement: with my tests, the glibc version took approx. 69s
> 	to run, while the cacheable_memcpy took only 42s (repeated
> 	many times to avoid noise errors).

Wait a second... start with a high-level view.
Why is memcpy being used so much? What is it
being called on? (How big is it? Is either
address 8-byte or 32-byte aligned? Is the
source cached/cacheable? Is the destination
cached/cacheable?) Maybe you'd better
profile this a bit. Oh well. Some MPC7xx "code"
for you to look at...
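
One way to answer those questions is to count what memcpy is
actually asked to do. The following is a minimal sketch in C, not
anything taken from xine or glibc: the names profiled_memcpy,
memcpy_stats and memcpy_stats_global are hypothetical, and the
wrapper would have to be substituted for the real calls by hand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counters; these names are illustrative only. */
struct memcpy_stats {
    unsigned long calls;            /* number of copies seen */
    unsigned long bytes;            /* total bytes requested */
    unsigned long both_aligned_8;   /* src and dst both 8-byte aligned */
    unsigned long both_aligned_32;  /* src and dst both 32-byte aligned */
    unsigned long size_log2[33];    /* histogram indexed by floor(log2(n)) */
};

struct memcpy_stats memcpy_stats_global;

/* floor(log2(n)), with n == 0 and n == 1 both landing in bucket 0 */
static unsigned log2_bucket(size_t n)
{
    unsigned b = 0;
    while (n > 1 && b < 32) {
        n >>= 1;
        b++;
    }
    return b;
}

/* Record size and alignment, then do the real copy. */
void *profiled_memcpy(void *dst, const void *src, size_t n)
{
    uintptr_t both = (uintptr_t)dst | (uintptr_t)src;

    memcpy_stats_global.calls++;
    memcpy_stats_global.bytes += n;
    if ((both & 7u) == 0)
        memcpy_stats_global.both_aligned_8++;
    if ((both & 31u) == 0)
        memcpy_stats_global.both_aligned_32++;
    memcpy_stats_global.size_log2[log2_bucket(n)]++;

    return memcpy(dst, src, n);
}
```

Dumping memcpy_stats_global at exit shows whether large, 32-byte
aligned copies really dominate. Whether the memory is cacheable
cannot be seen from this wrapper and has to be checked separately.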

Assumptions:

1. huge copies
2. 32-byte alignment (both src & dst)
3. can be cached (both src & dst)
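
The first two assumptions can be tested before choosing a copy
routine. A rough sketch in C follows; fast_copy_applies and the
64 KiB threshold are assumptions made for illustration, and the
whole-line length check is an extra condition added here because
the loop moves 32 bytes per pass. Assumption 3 (cacheability) is
not visible from a pointer and is not checked.

```c
#include <stddef.h>
#include <stdint.h>

#define FAST_LINE_BYTES  32u             /* cache line size assumed by the loop */
#define FAST_MIN_BYTES   (64u * 1024u)   /* assumed cutoff for "huge"; tune it */

/* Returns 1 if a copy fits assumptions 1 and 2, otherwise 0.
 * Cacheability (assumption 3) must be known by the caller. */
int fast_copy_applies(const void *dst, const void *src, size_t n)
{
    uintptr_t addrs = (uintptr_t)dst | (uintptr_t)src;

    if (n < FAST_MIN_BYTES)
        return 0;                        /* not a huge copy */
    if ((addrs & (FAST_LINE_BYTES - 1u)) != 0)
        return 0;                        /* src or dst not 32-byte aligned */
    if ((n % FAST_LINE_BYTES) != 0)
        return 0;                        /* would leave a partial line */
    return 1;
}
```

Anything that fails the check would fall back to the ordinary
memcpy.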

I didn't check to see which FPU registers are
available for a leaf function to abuse. (I forget)
This obviously isn't tested. It might be good to
unroll the loop a bit.

Note that I discard both src and dst. I'm expecting
them to be a megabyte or so, which would just blow
away the cache for no good reason. This way, only
one "way" of the n-way associative cache gets lost.

Play with the ordering a bit.

//////////////////////////////////////////////////////////////////

#define dcba dcbz   /* dcba being removed from Power/PowerPC? */
#define dcbi dcbf   /* dcbi is a supervisor-level instruction */

#define dst r3
#define src r4
#define num r5
#define eight r8  /* must load a constant 8 into r8 */

BLAH, BLAH...

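  /* assumes src and dst enter 8 bytes below the data, CTR = num/32 */
  dcbt  eight,src        /* prefetch the first cache line */
loop_top:
  dcba  eight,dst        /* allocate a cache line */
  lfd   f10,8(src)       /* f10-f13 are volatile in the SVR4 PowerPC ABI; */
  lfd   f11,16(src)      /* f14 and up would have to be saved and restored */
  lfd   f12,24(src)
  lfdu  f13,32(src)      /* load, then src += 32 */
  dcbi  r0,src           /* would like to discard the src data */
  dcbt  eight,src        /* prefetch the next cache line */
  stfd  f10,8(dst)
  stfd  f11,16(dst)
  stfd  f12,24(dst)
  stfdu f13,32(dst)      /* store, then dst += 32 */
  dcbf  r0,dst           /* write back if needed, then invalidate */
  bdnz  loop_top         /* one pass per 32-byte line */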

BLAH, BLAH...

//////////////////////////////////////////////////////////////////
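
For checking the pointer arithmetic, the data movement of the loop
can be modelled in portable C. This is only a sketch under
assumptions: it supposes the elided setup code leaves src and dst
8 bytes below the data and loads CTR with num/32, it ignores every
cache hint, and copy_lines_model is a hypothetical name. Unlike
bdnz, the model does nothing when the count is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_LINE_BYTES 32

/* Models loads at 8..32 past a pointer biased by -8, followed by the
 * update of that pointer, once per 32-byte line. Returns bytes copied. */
size_t copy_lines_model(void *dst_v, const void *src_v, size_t num)
{
    unsigned char *dst = dst_v;
    const unsigned char *src = src_v;
    ptrdiff_t s = -8;                        /* assumed initial bias of src */
    ptrdiff_t d = -8;                        /* assumed initial bias of dst */
    size_t ctr = num / MODEL_LINE_BYTES;     /* assumed CTR value */
    size_t copied = 0;

    assert(num % MODEL_LINE_BYTES == 0);     /* whole lines only */

    while (ctr != 0) {
        uint64_t f[4];                       /* stands in for four FP registers */
        int i;

        for (i = 0; i < 4; i++)              /* lfd at offsets 8, 16, 24, 32 */
            memcpy(&f[i], src + (s + 8 * (i + 1)), 8);
        s += MODEL_LINE_BYTES;               /* update done by lfdu */

        for (i = 0; i < 4; i++)              /* stfd at offsets 8, 16, 24, 32 */
            memcpy(dst + (d + 8 * (i + 1)), &f[i], 8);
        d += MODEL_LINE_BYTES;               /* update done by stfdu */

        copied += MODEL_LINE_BYTES;
        ctr--;                               /* bdnz decrements CTR */
    }
    return copied;
}
```

Comparing its output against plain memcpy on a test buffer gives a
reference result to check the assembly version against.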

