Bug#512050: gcc-4.3: pessimizes function without SSE intrinsics
Brian,
Upstream closed this bug report as invalid:
"That is because the early complete unrolling comes and unrolls the
loop so the autovectorizer does not have a loop to work on anymore.
If I increase it to be 16 instead of 4, the loop is vectorized.
So the original testcase is invalid as two things: aliasing and
alignment. Aliasing because out could overlap with in1/in2, restrict
fixes that. And then the alignment comes into play because there is
no way to say the incoming arguments are 16 byte aligned."
"t.c:11: note: cost model: Adding cost of checks for loop versioning
to treat misalignment.
t.c:11: note: cost model: Adding cost of checks for loop versioning
aliasing."
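
For illustration, here is a minimal sketch of how the two points upstream
raises (aliasing and alignment) might be expressed in source. The function
name add_floats and its signature are hypothetical stand-ins, since the
original testcase is not reproduced in this message. The C99 `restrict`
qualifier promises that out does not overlap in1/in2. The alignment hint
uses __builtin_assume_aligned, which only exists in GCC 4.7 and later, so
it would not help with gcc-4.3 itself; it is shown only as an assumption
about how one could state 16-byte alignment on a newer compiler.

```c
#include <stddef.h>

/* Hypothetical stand-in for the reported function (not the original testcase).
 * `restrict` tells the compiler that out, in1 and in2 never overlap, so no
 * run-time aliasing check (loop versioning) is needed. */
void add_floats(float *restrict out,
                const float *restrict in1,
                const float *restrict in2,
                size_t n)
{
    /* Assumption: the caller guarantees 16-byte alignment of all three
     * buffers. __builtin_assume_aligned is a GCC builtin available from
     * GCC 4.7 onwards; passing a misaligned pointer here is undefined. */
    float *o = __builtin_assume_aligned(out, 16);
    const float *a = __builtin_assume_aligned(in1, 16);
    const float *b = __builtin_assume_aligned(in2, 16);

    for (size_t i = 0; i < n; i++)
        o[i] = a[i] + b[i];
}
```

Callers would need to declare their buffers with something like
__attribute__((aligned(16))) for the alignment promise to hold. Whether
the loop is then actually vectorized still depends on the trip count and
the cost model, as the upstream comment about 16 versus 4 suggests.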
--
Martin Michlmayr
http://www.cyrius.com/