Re: Using PPC asm (from Linux kernel) in xine
On 20 May, this message from Rogério Brito echoed through cyberspace:
> Anyway, one of the first things I see is that xine uses a
> function called xine_fast_memcpy, which is an alternative
> memcpy function possibly written in assembly (if available) or
> the standard glibc, if no other version is available, as is
> the case with PPC.
>
> I saw that the Linux kernel has an assembly implementation of
> memcpy and decided to try that instead of the glibc version.
Hmm.. kernel....
> After just a few adaptations and removals of unnecessary
> functions, I ended up with a string.S file with only
> cacheable_memcpy and memcpy, which seem to be the important
> parts of the file for my purposes.
>
> According to my tests, cacheable_memcpy is approximately 40%
> faster than the original glibc version, which is quite an
> improvement: with my tests, the glibc version took approx. 69s
> to run, while the cacheable_memcpy took only 42s (repeated
> many times to avoid noise errors).
You may want to try floating-point loads and stores through the FPRs.
On 32-bit PPC each lfd/stfd moves 8 bytes, versus 4 bytes for an integer
lwz/stw, so that is typically faster. The gain may depend on the
cacheability of the source/destination memory, but it's definitely
worth a try.
By the way, the kernel can't use that trick, since floating point is a
big no-no inside kernel code.
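
To make the idea concrete, here is a rough C sketch rather than tuned
assembly. The name fpr_memcpy is made up for illustration and is not part
of xine, glibc or the kernel. It assumes that a double load/store pair
preserves the bit pattern (true for lfd/stfd on PPC) and it only takes the
fast path when both pointers are 8-byte aligned. Whether the compiler
really emits lfd/stfd for the loop has to be checked in the generated
assembly; a hand-written loop may be needed to guarantee it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes using 8-byte double loads/stores
 * where possible. Like memcpy, the regions must not overlap. */
static void *fpr_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Assumption: only handle the case where both pointers are 8-byte
     * aligned; anything else goes to the ordinary memcpy. */
    if ((((uintptr_t)d | (uintptr_t)s) & 7) != 0)
        return memcpy(dst, src, n);

    double *dd = (double *)d;
    const double *sd = (const double *)s;
    size_t blocks = n / 32;

    /* Move 32 bytes per iteration: four loads, then four stores, so the
     * loads can be issued before the stores need their data. */
    while (blocks--) {
        double a = sd[0], b = sd[1], c = sd[2], e = sd[3];
        dd[0] = a;
        dd[1] = b;
        dd[2] = c;
        dd[3] = e;
        sd += 4;
        dd += 4;
    }

    /* Copy the remaining 0..31 bytes the ordinary way. */
    size_t done = (n / 32) * 32;
    memcpy(d + done, s + done, n - done);
    return dst;
}
```

You would time this against glibc's memcpy and your cacheable_memcpy with
the same test you already have, on buffers of the size xine really copies.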
Cheers
Michel
-------------------------------------------------------------------------
Michel Lanners | " Read Philosophy. Study Art.
23, Rue Paul Henkes | Ask Questions. Make Mistakes.
L-1710 Luxembourg |
email mlan@cpu.lu |
http://www.cpu.lu/~mlan | Learn Always. "