
Re: More help for ams statements needed (Was: [Help] Need help for architecture specific code)






On Fri, Sep 5, 2014 at 10:57 AM, Andrey Rahmatullin <wrar@debian.org> wrote:
On Fri, Sep 05, 2014 at 01:35:13PM +0200, Andreas Tille wrote:
>   2. i386 results in
>
>       ebwt.h: Assembler messages:
>       ebwt.h:1909: Error: invalid instruction suffix for `popcnt'
>       ebwt.h:1909: Error: invalid instruction suffix for `popcnt'
>       ebwt.h:1909: Error: invalid instruction suffix for `popcnt'
>       ebwt.h:1909: Error: invalid instruction suffix for `popcnt'
>       make[2]: *** [bowtie-build] Error 1
>
>      The relevant line in the code is:
>
>       $ grep -w -n  asm e*
>       ebwt.h:1909:            asm ("popcntq %[x],%[count]\n": [count] "=&r" (count): [x] "r" (x));
Unless someone investigates why GCC on i386 doesn't know this instruction,
I suggest compiling with POPCNT_CAPABILITY=0 on i386.

popcntq, or more specifically the 'q' suffix, is 64-bit only: it performs a population count on a 64-bit general-purpose register (GPR). In 32-bit mode you only have 32-bit GPRs, so the assembler rejects the suffix. Something like this is better written with intrinsics, which can handle the details of "How do I get the popcount of a 64-bit value on this machine?". For example, two 32-bit popcounts summed give exactly the same answer as a single 64-bit popcount. Compilers can handle this automatically, so let them.

Upstream should really use a more general popcount() function and let the build architecture decide which implementation to use. If they really, really want an SSE 4.2 code path, they should detect it and then use intrinsics, which can produce proper SSE 4.2 code for both 32-bit and 64-bit builds.
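
To make the suggestion concrete, here is a rough sketch of what such a general popcount could look like. This is not upstream's code: the names popcount32, popcount64_split and popcount64 are made up for illustration, and I am assuming GCC or Clang for the __builtin_popcountll path. The fallback shows the "two 32-bit popcounts summed" idea using the usual bit-twiddling trick.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit population count using the classic
 * SWAR bit trick, no special instructions required. */
static uint32_t popcount32(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* 2-bit sums */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* 4-bit sums */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (v * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Hypothetical helper: 64-bit popcount as the sum of two 32-bit
 * popcounts, which works on any architecture. */
static uint32_t popcount64_split(uint64_t x)
{
    return popcount32((uint32_t)x) + popcount32((uint32_t)(x >> 32));
}

/* Hypothetical wrapper: let the compiler pick the implementation where
 * it can, and fall back to the portable version elsewhere. */
static uint32_t popcount64(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    uint32_t n = (uint32_t)__builtin_popcountll(x);
    /* Debug-only cross-check; compiled out when NDEBUG is defined. */
    assert(n == popcount64_split(x));
    return n;
#else
    return popcount64_split(x);
#endif
}
```

As far as I know, when the target allows it (for example with -mpopcnt) GCC turns the builtin into the hardware instruction on both i386 and amd64, and otherwise it emits a generic software sequence, so the same source builds everywhere without inline asm.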



--
WBR, wRAR

