
Re: G3's, G4's, Altivec, DVD's, iBooks, and TiBooks



G. Branden Robinson writes:

> Are there people actively working on this?  The mood I was getting was
> that G3's weren't worth the trouble.  I myself don't have much in the way
> of PowerPC assembly brains, or video processing mojo.

Some basic tips:

Sketch out data flow on a piece of paper. This helps with register
allocation and instruction ordering. Instructions depend on each
other in a tree-like way; draw arrows labeled with the dependency.
The dependency might be "memory", "r[]" (where "[]" represents an
unallocated register which you will fill in later), CR, CTR, LR...

Then try to interleave the independent streams of instructions as
much as possible. This helps you to avoid stalls caused by
instructions waiting for others to complete. Also try to mix up
your use of the different (int, FP, branch, mem) functional units.
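As a rough illustration of interleaving, here is a sketch in C rather
than assembly. The function name sum_interleaved is made up for this
example. It keeps two accumulators so that consecutive additions do not
wait on each other; a compiler may or may not schedule it this way, so
treat it as a picture of the idea rather than a guaranteed speedup.

```c
#include <stddef.h>

/* Sum an array with two independent accumulators. The additions into
   'a' and 'b' form separate dependency chains, so neither chain has
   to wait for the other's result. */
long sum_interleaved(const int *v, size_t n)
{
    long a = 0, b = 0;
    size_t i;

    for (i = 0; i + 1 < n; i += 2) {
        a += v[i];      /* stream A */
        b += v[i + 1];  /* stream B, independent of stream A */
    }
    if (i < n)
        a += v[i];      /* leftover element when n is odd */

    return a + b;       /* the two streams meet only here */
}
```

In hand-written assembly the same shape applies: alternate the
instructions of the two chains instead of finishing one chain first.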

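Watch out for r0 in load/store instructions. In the base-register
slot it means the constant 0, not the contents of r0 (addi and addis
treat it the same way).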
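To make the r0 trap concrete, here is a small C model of how the
effective address of a D-form access such as "lwz rD,d(rA)" is formed.
The function name dform_ea and the gpr array are invented for this
sketch; only the rule itself (rA = 0 reads as zero) comes from the
architecture.

```c
#include <stdint.h>

/* Model of the effective address for "lwz rD, d(rA)" and friends.
   gpr[] stands in for the 32 general-purpose registers. */
uint32_t dform_ea(const uint32_t gpr[32], unsigned ra, int16_t d)
{
    /* rA == 0 selects the constant 0, NOT the value held in r0 */
    uint32_t base = (ra == 0) ? 0u : gpr[ra];

    /* the 16-bit displacement is sign-extended before the add */
    return base + (uint32_t)(int32_t)d;
}
```

So with 0x1000 in r0, "lwz r3,8(r0)" loads from address 8, not from
0x1008.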

Get comfortable with the rlwimi and similar instructions.
With these you can rotate, mask, and copy bits.
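If the operands of rlwimi look cryptic, a C model may help. This is a
sketch of "rlwimi rA,rS,SH,MB,ME" as I understand it: rotate rS left by
SH, then copy only the bits selected by the MB..ME mask into rA. The
helper names rotl32 and ppc_mask are mine. Remember that PowerPC numbers
bits from the most significant end, so bit 0 is the MSB.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Mask with ones from bit mb through bit me, where bit 0 is the MSB.
   If mb > me the mask wraps around the ends of the word. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;         /* bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me);  /* bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of rlwimi: bits under the mask come from the rotated rS,
   all other bits of rA are left unchanged. */
uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned sh,
                unsigned mb, unsigned me)
{
    uint32_t m = ppc_mask(mb, me);
    return (rotl32(rs, sh) & m) | (ra & ~m);
}
```

For example, rlwimi(0xAABBCCDD, 0x12, 8, 16, 23) drops the low byte of
rS into the second-lowest byte of rA and gives 0xAABB12DD, which is
handy for packing pixel components in one instruction.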

Don't be too afraid of floating-point. There is a fused multiply-add
instruction (A=x*y+z) that is great for the matrix operations
commonly found in video processing. You can do at least one
operation per cycle, in parallel with the integer pipeline, with
a latency of 4 or 5 cycles.
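Here is a sketch of why multiply-add fits matrix work so well, written
in C. The function name mat3_mul_vec3 and the row-major 3x3 layout are
assumptions for the example. Every step after the first has the x*y+z
shape; whether a compiler actually emits the fused instruction for it
depends on the target and flags, so take this as the shape of the
computation only.

```c
/* y = M * x for a 3x3 matrix stored row-major in m[0..8]. */
void mat3_mul_vec3(const double m[9], const double x[3], double y[3])
{
    for (int r = 0; r < 3; r++) {
        const double *row = &m[3 * r];

        double acc = row[0] * x[0];  /* plain multiply starts the row */
        acc = row[1] * x[1] + acc;   /* multiply-add shape: x*y + z   */
        acc = row[2] * x[2] + acc;   /* multiply-add shape again      */
        y[r] = acc;
    }
}
```

The three rows are independent of each other, so in assembly their
multiply-adds can be interleaved to hide the latency of each one.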

Take advantage of the cache control instructions. They let you
prefetch, flush, zero, discard... Stick to the usual guidelines
for cache-aware programming as well, keeping your footprint small
and so on.
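A minimal prefetch sketch in C, for those not writing the assembly by
hand. It relies on __builtin_prefetch, which is a GCC/Clang extension
and not standard C; on PowerPC it typically turns into a dcbt. The
function name, the prefetch distance, and the assumption of 32-byte
cache lines with 4-byte ints are all guesses to tune, not measured
values.

```c
#include <stddef.h>

#define INTS_PER_LINE  8   /* assumes 32-byte lines and 4-byte int */
#define PREFETCH_AHEAD 64  /* distance in elements; a tuning guess */

/* Sum an array, hinting the cache about data needed a little later. */
long sum_with_prefetch(const int *v, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* one hint per cache line is enough; the hint never faults */
        if (i % INTS_PER_LINE == 0 && i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&v[i + PREFETCH_AHEAD], 0, 0);
#endif
        total += v[i];
    }
    return total;
}
```

The result is the same with or without the hint; only the timing
changes, so measure before keeping it.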

BTW, procmail can filter out duplicate messages.


