Bug#799905: gcc-4.7: generates broken SSE2 code for -ftree-vectorize/-O3 for unaligned dword access
On Thu, 24 Sep 2015, Henrique de Moraes Holschuh wrote:
> It still exists on gcc-4.9 in stable (gcc Debian 4.9.2-10). I am going to
> test on an unstable chroot and gcc-snapshot in a few moments.
I've tested it in unstable as well. Here's a proper summary:
1. Bug exists in gcc Debian 4.6.4-7 (latest 4.6 in unstable)
2a. Bug exists in gcc Debian 4.7.2-5 (default gcc for oldstable)
2b. Bug exists in gcc Debian 4.7.4-3 (latest 4.7 in unstable)
3. Bug exists in gcc Debian 4.8.5-1 (latest 4.8 in unstable)
4a. Bug exists in gcc Debian 4.9.2-10 (default gcc for stable)
4b. Bug exists in gcc Debian 4.9.3-4 (latest 4.9 in unstable)
5. Bug exists in gcc Debian 5.2.1-17 in unstable.
6. Bug exists in gcc-snapshot gcc (Debian 20150913-1) 6.0.0 20150913
(experimental) [trunk revision 227716] -- latest in unstable
So, it is a long-standing issue, and most likely still unfixed upstream.
THE ISSUE:
1. Happens when -ftree-vectorize and -msse2 are active. It exists
both in the 32-bit compiler/target (i386/i686), and 64-bit
compiler/target (amd64).
2. CFLAGS=-O3 is enough to trigger it in amd64/x86-64, as SSE2 is
enabled by default in amd64.
3. CFLAGS="-O3 -msse2" triggers it in i386 (i686)
4. Is related to vectorized non-byte access to *unaligned* data. C code
that does this is *expected to work* on i386/i686/amd64 (and, in fact,
recent Intel and AMD processors explicitly *optimize* it so that it is
not expensive). gcc does not detect that the data could be unaligned in
_this specific case_ and generates SSE2 vector instructions of the
"aligned" variant *in error*.
The issue is present for pointers to 16-bit, 32-bit and 64-bit data
types.
5. The issue will cause a general protection fault (seen by the program
as a segmentation fault) on all Intel processors from the Pentium M
(2005) to Core i5-2400, including Xeon X5550. I do not have access to
more recent processors to test.
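For illustration only (this is NOT the attached test case), here is a minimal sketch of the kind of loop item 4 describes. The function names sum_words_direct and sum_words_memcpy are hypothetical. The first reads 32-bit words through a uint32_t pointer, which is the pattern the vectorizer may turn into "aligned" SSE2 loads. The second copies each word out of a byte buffer with memcpy, so it assumes nothing about alignment. Compilers usually reduce such a memcpy to a plain (unaligned) load, but that is an expectation, not something verified here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: sums n 32-bit words through a uint32_t pointer.
 * This is the access pattern discussed above; it is only safe to call
 * when p really is suitably aligned for uint32_t. */
uint32_t sum_words_direct(const uint32_t *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical alternative: reads each word with memcpy from a byte
 * pointer, so no alignment of `bytes` is assumed. */
uint32_t sum_words_memcpy(const unsigned char *bytes, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t w;
        memcpy(&w, bytes + i * sizeof w, sizeof w);
        sum += w;
    }
    return sum;
}
```

Calling sum_words_memcpy(buf + 1, n) on a byte buffer is fine at any offset. Passing (const uint32_t *)(buf + 1) to sum_words_direct is the situation this report is about.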
Note that we have quite a number of packages that compile using -O3 in
Debian. The majority will never attempt to dereference an unaligned
pointer, but some might (e.g. on corrupt data files, or on data files
whose contents are badly misaligned).
THE ATTACHED TEST-CASE:
1. Uses the number of arguments in the command line to select the
data alignment.
2. It segfaults when the issue is triggered.
3. It is enough to run it without any parameters to show the issue.
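As a rough sketch of how item 1 could be done (an assumption about the approach, not a copy of the attached test_vector.c), the byte offset into a 16-byte aligned buffer can be derived from argc. The names storage, select_buffer and misalignment are hypothetical. Using a run-time value such as argc keeps the compiler from knowing the alignment at build time. With no parameters argc is 1, so the buffer starts one byte past an aligned address and is misaligned for 16-bit, 32-bit and 64-bit accesses alike.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backing store, aligned to 16 bytes (the SSE2 vector width);
 * the extra 16 bytes leave room for the largest offset. */
static _Alignas(16) unsigned char storage[256 + 16];

/* Hypothetical helper: returns a pointer (argc % 16) bytes past the
 * aligned base, so the argument count picks the alignment. */
unsigned char *select_buffer(int argc)
{
    return storage + (size_t)(argc % 16);
}

/* Returns how many bytes p lies past the previous multiple of align. */
size_t misalignment(const void *p, size_t align)
{
    return (size_t)((uintptr_t)p % align);
}
```

Under these assumptions, select_buffer(1) is off by one byte for every word size, while select_buffer(16) lands back on a 16-byte boundary.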
ON AMD64:
gcc -O3 -o test_vector test_vector.c && ./test_vector
Segmentation fault
ON i386 (i686):
gcc -O3 -msse2 -o test_vector test_vector.c && ./test_vector
Segmentation fault
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh