
Re: [xine-user] [ANN] PowerPC Assembly Patch



On May 25 2002, Andrew Patrikalakis wrote:
> Hello all,

	Hi, Andrew.

> With all the recent talk of use of assembly on the PowerPC, I came
> up with a patch to use assembly versions of memcpy. It's about 35%
> faster. Here is a sample of the memcpy speed test (which also now
> works):

	Thanks for helping xine get better on PowerPC.

	And also thanks for looking into my earlier message and
	discovering which part of the code was giving the relocation
	error (I just read your patch).

> Benchmarking memcpy methods (smaller is better):
> 	glibc memcpy() : 136
> 	ppcasm_memcpy() : 137
> 	ppcasm_cacheable_memcpy() : 88
> xine: using ppcasm_cacheable_memcpy()
> (The lower time resolution is because I'm using times(NULL) in rdtsc())

	We can also look into getting an extra version of memcpy that
	makes the transfers with floating point registers as some
	people suggested on the Debian PowerPC mailing list.

	People there said that using floating point registers (which
	are 64 bits wide) instead of general purpose registers (32
	bits each) may improve things, since each load and store
	would then move twice as many bytes.
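
	To make the idea concrete, here is a minimal sketch in C of
	copying in 64-bit units. The name copy64_sketch is made up
	for illustration and is not part of Andrew's patch. Plain C
	cannot force the use of floating point registers, so a
	uint64_t temporary stands in for one here; a real PowerPC
	version would presumably be written in assembly with the
	64-bit floating point load and store instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copy n bytes, moving
 * 8 bytes per iteration through a 64-bit temporary. */
static void *copy64_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 8) {
        uint64_t tmp;           /* stands in for a 64-bit register */
        memcpy(&tmp, s, 8);     /* one 64-bit load */
        memcpy(d, &tmp, 8);     /* one 64-bit store */
        s += 8;
        d += 8;
        n -= 8;
    }
    while (n > 0) {             /* copy the remaining 0..7 tail bytes */
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

	Like memcpy, this assumes that the source and destination
	do not overlap. Whether it actually beats the general
	purpose register version would have to be measured.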

> I'd like to know how much it helps PPC users, so keep this list up
> to date with the results. (Also, if my patch breaks other
> platforms...) It gave my Mac laptop the little boost it needed to
> play some media I have.

	Well, with the faster memcpy and with XFree86 4.2.0 (with DMA
	enabled), I can watch a DVD here with linearblend
	deinterlacing (coded in C) enabled, and about 15% of the
	frames are skipped. That is still not perfect, but it is quite
	an improvement over the situation some weeks ago.
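
	For reference, a linear blend deinterlacer is usually
	described as a vertical (1, 2, 1)/4 filter over neighbouring
	lines. The sketch below assumes that definition and a single
	8-bit plane with no padding between lines; it is not copied
	from the xine sources, and the name linearblend_sketch is
	made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: blend every interior line with the lines
 * above and below it using (1, 2, 1)/4 weights. The first and
 * last lines have only one neighbour and are copied unchanged. */
static void linearblend_sketch(uint8_t *dst, const uint8_t *src,
                               size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++) {
        const uint8_t *cur = src + y * width;
        uint8_t *out = dst + y * width;

        if (y == 0 || y + 1 == height) {
            for (size_t x = 0; x < width; x++)
                out[x] = cur[x];
            continue;
        }

        const uint8_t *up = cur - width;
        const uint8_t *down = cur + width;
        for (size_t x = 0; x < width; x++)
            out[x] = (uint8_t)((up[x] + 2 * cur[x] + down[x]) >> 2);
    }
}
```

	The inner loop touches three source lines per output line,
	which is why this kind of filter is a natural candidate for
	hand-written assembly.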

	BTW, I am using gcc-3.0 to compile xine-libs and I added some
	extra options to the configure script (-mfused-madd,
	-mcpu=750, -mtune=750, -O9).

	The next points of improvement (which may not be as immediate
	as the memcpy under discussion) may be coding the idct,
	motion compensation and deinterlacing in assembly as well.

	I guess that I'll have to learn a bit more before I can get
	to these, but with the help of other people, things could go
	faster.

> Just so you know, the methods I used are from the linux kernel
> version 2.4.18 (arch/ppc/lib/string.S)

	Yes, that's what I tried in my earlier message, but I wasn't
	as successful as you were.

	Your patch had a problem, though, and I had to apply part of
	it by hand. You might want to regenerate it and send it to
	the xine developers so that it can be included in xine
	release 0.9.10, which should be out soon.

> Andrew Patrikalakis


	Thanks for your help, Roger...

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  Rogério Brito - rbrito@iname.com - http://www.ime.usp.br/~rbrito/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=




