
Bug#799905: gcc-4.7: generates broken SSE2 code for -ftree-vectorize/-O3 for unaligned dword access



On Thu, 24 Sep 2015, Henrique de Moraes Holschuh wrote:
> It still exists on gcc-4.9 in stable (gcc Debian 4.9.2-10).  I am going to 
> test on an unstable chroot and gcc-snapshot in a few moments.

I've tested it in unstable as well.  Here's a proper summary:

1.  Bug exists in gcc Debian 4.6.4-7 (latest 4.6 in unstable)
2a. Bug exists in gcc Debian 4.7.2-5 (default gcc for oldstable)
2b. Bug exists in gcc Debian 4.7.4-3 (latest 4.7 in unstable)
3.  Bug exists in gcc Debian 4.8.5-1 (latest 4.8 in unstable)
4a. Bug exists in gcc Debian 4.9.2-10 (default gcc for stable)
4b. Bug exists in gcc Debian 4.9.3-4 (latest 4.9 in unstable)
5.  Bug exists in gcc Debian 5.2.1-17 in unstable.

6.  Bug exists in gcc-snapshot gcc (Debian 20150913-1) 6.0.0 20150913
    (experimental) [trunk revision 227716]  -- latest in unstable

So, it is a long-standing issue, and most likely still unfixed upstream.


THE ISSUE:

1. Happens when -ftree-vectorize and -msse2 are active.  It exists
   both in the 32-bit compiler/target (i386/i686), and 64-bit
   compiler/target (amd64).

2. CFLAGS=-O3 is enough to trigger it in amd64/x86-64, as SSE2 is
   enabled by default in amd64.

3. CFLAGS="-O3 -msse2" triggers it in i386 (i686)

4. Is related to vectorized non-byte access to *unaligned* data.  C code
   that does this is *expected to work* on i386/i686/amd64 (and, in fact,
   recent Intel and AMD processors explicitly *optimize* such accesses so
   that they are not expensive).  gcc does not detect that the data could
   be unaligned in _this specific case_ and generates SSE2 vector
   instructions of the "aligned" variant *in error*.

   The issue is present for pointers to 16-bit, 32-bit and 64-bit data
   types.

5. The issue will cause a general protection fault on all Intel processors
   from the Pentium M (2005) to Core i5-2400, including Xeon X5550.  I do
   not have access to more recent processors to test.
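
To make item 4 concrete, here is a minimal sketch of the kind of access
pattern involved.  It is NOT the attached test_vector.c: the function names
sum_direct() and sum_memcpy() are made up for illustration only.  The first
function is the sort of loop that gcc may vectorize at -O3; the second reads
the same data through memcpy(), which does not require any alignment and is
shown only as a possible way to sidestep the problem, not as a fix for gcc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only, not the attached test case.
 *
 * Sums n 32-bit words through a uint32_t pointer.  If p is not 4-byte
 * aligned, this is undefined behaviour in ISO C, although plain scalar
 * loads on x86 tolerate it.  At -O3 gcc may vectorize this loop, which
 * is where the report sees "aligned" SSE2 instructions being used. */
uint32_t sum_direct(const uint32_t *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Same sum, but each word is copied out of a byte buffer with memcpy(),
 * so the buffer may start at any address. */
uint32_t sum_memcpy(const unsigned char *bytes, size_t n)
{
    assert(bytes != NULL);

    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t v;
        memcpy(&v, bytes + i * sizeof v, sizeof v);
        sum += v;
    }
    return sum;
}
```

Calling sum_direct((const uint32_t *)(buf + 1), n) on a byte buffer buf is
the risky case; sum_memcpy(buf + 1, n) reads the same bytes without relying
on any alignment.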

Note that we have quite a number of packages that compile using -O3 in
Debian.  The majority will never attempt to dereference an unaligned
pointer, but some might (e.g. on corrupt data files, or on data files whose
contents are not naturally aligned).


THE ATTACHED TEST-CASE:

1. Uses the number of arguments in the command line to select the
   data alignment.

2. It segfaults when the issue is triggered.

3. It is enough to run it without any parameters to show the issue.

ON AMD64:
gcc -O3 -o test_vector test_vector.c && ./test_vector
Segmentation fault

ON i386 (i686):
gcc -O3 -msse2 -o test_vector test_vector.c && ./test_vector
Segmentation fault


-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

