
Re: pa-risc/linux abi



On Wed, Jun 23, 2004 at 08:20:22PM -0400, R Clint Whaley wrote:
> In what I think ought to be true in theory, if fr5L is the target
> of a fp op, you must wait at least FPU pipe length clock cycles
> before using fr5R or fr5L.  In practice,
> even in assembler, I never got close to peak until I used only one of the
> pair anytime it was the output of a muladd, so obviously my theoretical
> understanding is incomplete.

Are you maybe getting "interference" from D-cache misses?

AFAIK, the general (integer) registers are interlocked, so
the depth of the pipeline is irrelevant. I don't know whether
that's also true for FP regs in general or only on specific CPU models.
Normally I'd expect the number of cycles a particular FOP takes
to complete to dictate when the FP reg (both left and right halves)
can be used again as the target or source of other ops.
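
To make that expectation concrete, here is a minimal sketch in C of an
interlocked register file where the left and right halves of a pair share
one "ready" cycle. It is a toy model only, not taken from any PA-RISC
manual: the names (fp_scoreboard, sb_init, sb_issue), the pair count, and
whatever latency the caller passes in are all assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical model: one entry per FP register pair (frN = frNL + frNR).
   The pair count is an assumption made for this sketch. */
#define NUM_PAIRS 32

typedef struct {
    long ready[NUM_PAIRS]; /* first cycle at which the pair may be used again */
} fp_scoreboard;

/* Mark every pair as immediately available. */
static void sb_init(fp_scoreboard *sb)
{
    for (int i = 0; i < NUM_PAIRS; i++)
        sb->ready[i] = 0;
}

/* Try to issue, at cycle `now`, an op that reads pair `src` and writes
   pair `dst`. The op stalls until both pairs are ready, then occupies
   `dst` for `latency` cycles. Returns the cycle at which it issues. */
static long sb_issue(fp_scoreboard *sb, int dst, int src, long now, long latency)
{
    assert(dst >= 0 && dst < NUM_PAIRS);
    assert(src >= 0 && src < NUM_PAIRS);

    long start = now;
    if (sb->ready[src] > start)
        start = sb->ready[src]; /* source still being produced */
    if (sb->ready[dst] > start)
        start = sb->ready[dst]; /* target (either half) still in flight */

    sb->ready[dst] = start + latency;
    return start;
}
```

Under this model, an op that touches either half of a pair still in flight
simply stalls until the earlier op completes, so scheduling affects speed
but not correctness. Whether real FP hardware behaves this way is exactly
the open question above.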

One might look at fully optimized HP-UX acc output if all
else fails.

grant


