On Wed, Mar 02, 2005 at 05:16:50PM +0100, Florian Weimer wrote:
> * Paul Brossier:
>
> > Two questions:
> > - can anyone spot what in these codelets causes the non-pic ?
>
> Tables of constants are addressed directly, not in some IP-relative
> way.

thanks, i thought this could be sort of a problem. so iiuc:

KP707106781KP707106781:
	.float +0.707106781186547524400844362104849039284835938, +0.707106781186547524400844362104849039284835938

is ok, but

	pfmul KP707106781KP707106781, %mm3

is not pic compliant, and should be replaced with something like

	pfmul KP707106781KP707106781(%ebx), %mm3

given i didn't know anything about assembly yesterday, how far am i?

> > - how much can it hurt to have this non-pic in fftw3 ?
>
> It shouldn't matter much if all PIC code is grouped together in the
> binary because few pages have to be copied in this case. PIC code
> itself is always slower, significantly so if the code is using all
> available integer registers.

I did some tests running 1024-point ffts on an AMD 700, and the
difference between building with and without --enable-k7 was a drop of
roughly 1.5%. The drop could become more important on K7 with a smaller
number of points, where the k7/3dnow optimisations come in.

If there are no objections, and as suggested by upstream, i will disable
the k7 optimisations in order to make sure that libfftw3f.so remains PIC
compliant, even if it is a little slower.

cheers, piem