On Wed, Mar 02, 2005 at 05:16:50PM +0100, Florian Weimer wrote:
> * Paul Brossier:
>
> > Two questions:
> > - can anyone spot what in these codelets causes the non-pic ?
>
> Tables of constants are addressed directly, not in some IP-relative
> way.
Thanks, I suspected this could be the problem. So, if I understand
correctly:
KP707106781KP707106781: .float +0.707106781186547524400844362104849039284835938, +0.707106781186547524400844362104849039284835938
is ok, but
pfmul KP707106781KP707106781, %mm3
is not PIC-compliant, and should be replaced with something like
pfmul KP707106781KP707106781(%ebx), %mm3
Given that I knew nothing about assembly yesterday, how close am I?
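
To make the question concrete, here is a minimal sketch of what I
think the PIC form would look like, assuming i386 ELF and GNU as
(AT&T) syntax. The function name and the local label are
hypothetical, and the @GOTOFF relocation is my assumption about what
the "something like" above needs in practice, not what the fftw3
codelets do. As far as I can tell, a plain symbol(%ebx) still emits
an absolute address for the symbol, so the displacement has to be
expressed relative to the GOT base held in %ebx:

```asm
        .text
        .globl  pic_scale_example          # hypothetical name, for illustration only
        .type   pic_scale_example, @function
pic_scale_example:
        pushl   %ebx                       # %ebx is callee-saved, so preserve it
        call    1f                         # push the address of label 1 ...
1:      popl    %ebx                       # ... and pop it: %ebx = current PC
        addl    $_GLOBAL_OFFSET_TABLE_+[.-1b], %ebx   # %ebx = address of the GOT
        # The constant is addressed as an offset from the GOT base,
        # so no absolute address ends up in the text section.
        pfmul   KP707106781KP707106781@GOTOFF(%ebx), %mm3
        popl    %ebx
        ret

        .section .rodata
        .align  8
KP707106781KP707106781:
        .float  +0.707106781186547524400844362104849039284835938, +0.707106781186547524400844362104849039284835938
```

The cost is visible here: %ebx is tied up holding the GOT address,
which is one integer register fewer for the codelet itself.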
> > - how much can it hurt to have this non-pic in fftw3 ?
>
> It shouldn't matter much if all PIC code is grouped together in the
> binary because few pages have to be copied in this case. PIC code
> itself is always slower, significantly so if the code is using all
> available integer registers.
I did some tests running 1024-point FFTs on an AMD 700, and
building without --enable-k7 was roughly 1.5% slower than building
with it. The drop could be larger on a K7 with smaller numbers of
points, where the k7/3dnow optimisations come into play.
If there are no objections, and as suggested by upstream, I will
disable the k7 optimisations to make sure that libfftw3f.so
remains PIC-compliant, even though it will be a little slower.
cheers, piem