
Re: Architectures where unaligned access is (not) OK?



On 21/11/14 13:31, Bernhard R. Link wrote:
> Otherwise that memory
> might afterwards be regarded as lzo_memops_TU2_struct

lzo_memops_TU2_struct is declared with __attribute__((__may_alias__)),
so actually the right thing should be happening WRT aliasing in this case.
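
For illustration, a minimal sketch of what a may_alias type buys, assuming
GCC or Clang. The names two_bytes and copy2 are hypothetical stand-ins, not
liblzo's actual definitions. The attribute only relaxes type-based alias
analysis; alignment is a separate question.

```c
#include <stddef.h>

/* Hypothetical stand-in for a type like lzo_memops_TU2_struct.
 * __may_alias__ tells the compiler that accesses through this type
 * may alias objects of any other type, as char accesses do. */
struct two_bytes {
    unsigned char b[2];
} __attribute__((__may_alias__));

/* Copy two bytes with a single struct assignment. Because the struct
 * only contains unsigned char, its alignment requirement is normally 1. */
static void copy2(void *dst, const void *src)
{
    *(struct two_bytes *)dst = *(const struct two_bytes *)src;
}
```

Without the attribute, the compiler would be entitled to assume that a
store through the struct type cannot modify, say, an unsigned short.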

On 21/11/14 13:21, Thorsten Glaser wrote:
> • for i386 and especially amd64, all subarchitectures supported
>   by Debian/Linux jessie suffer so much from unaligned access,
>   speed-wise, that it’s worth the overhead of forcing aligned
>   access (i386, i486 maybe were not as badly affected)

I was hoping this statement was correct, because if it were, avoiding
unaligned accesses would be a clear win regardless, and the right thing
to do would be entirely uncontroversial.

Unfortunately, on my x86-64 laptop, my patched liblzo2 with
-DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as
the unpatched one for a simple test-case (uncompress
linux_3.17.orig.tar.xz to linux_3.17.orig.tar in a tmpfs, time lzop -c <
linux_3.17.orig.tar > /dev/null, repeat 3 times; results agree within 10%).
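
To make the two code paths being compared concrete, here is a rough sketch
(not liblzo's actual code; both function names are hypothetical). The cast
version reads the value in one access and relies on the CPU tolerating
misaligned pointers; the memcpy version is what a build that avoids
unaligned access has to fall back on, and how well it performs depends on
how the compiler lowers it.

```c
#include <stdint.h>
#include <string.h>

/* Direct load through a cast: one instruction on CPUs that allow
 * unaligned access, but undefined behaviour in ISO C if p is misaligned. */
static uint16_t load_u16_cast(const void *p)
{
    return *(const uint16_t *)p;
}

/* Alignment-safe load: copy the bytes into a properly aligned object.
 * Valid for any pointer to at least two readable bytes. */
static uint16_t load_u16_memcpy(const void *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```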

I'm trying out a slightly different approach: keeping the unaligned
accesses via casts like *(uint16_t *) on architectures where lzodefs.h
specifically allows them, but disabling the casts to struct { char[n] }
types, which are conditional on alignof(that struct) == 1, since those
seem to be the problematic ones.
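
As a sketch of the kind of check being described (assuming C11 for alignof
and GCC/Clang for the attribute; bytes4, MAY_ALIAS and copy4 are
hypothetical names, not liblzo's): the struct cast is used only when the
struct has the expected size and an alignment requirement of 1, and memcpy
is the fallback otherwise.

```c
#include <stdalign.h>
#include <string.h>

#if defined(__GNUC__)
#  define MAY_ALIAS __attribute__((__may_alias__))
#else
#  define MAY_ALIAS
#endif

/* Hypothetical n-byte wrapper struct, here with n == 4. */
struct bytes4 {
    unsigned char b[4];
} MAY_ALIAS;

static void copy4(void *dst, const void *src)
{
    /* The condition is a compile-time constant, so the compiler keeps
     * only one branch. */
    if (sizeof(struct bytes4) == 4 && alignof(struct bytes4) == 1) {
        *(struct bytes4 *)dst = *(const struct bytes4 *)src;
    } else {
        memcpy(dst, src, 4);
    }
}
```

Disabling the struct path amounts to always taking the memcpy branch.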

The CPUs for which lzodefs.h uses those casts are amd64, arm*
conditional on target CPU (so armel but not armhf in Debian terms),
arm64, cris, i386, m68k conditional on target CPU (__mc68020__ but not
__mcoldfire__), powerpc* if big-endian, and s390*.

    S

