
Re: Why not 03 ?



Bottom line: the vectorisation provided by -O3 can give big speed-ups to
some scientific programs, but it is ineffective on Debian because by
necessity it tells gcc to compile code for the lowest-common-denominator
CPU, which doesn't have the necessary instructions.

Ineffective on i386, but amd64 always has at least SSE2.

You can turn on -O3 (or -ftree-vectorize if you just want the vectorization) in a single package with DEB_CFLAGS_MAINT_APPEND and DEB_CXXFLAGS_MAINT_APPEND: https://wiki.debian.org/HardeningWalkthrough#My_package_builds_with_optimisation_flags_other_than_-O2.2C_e.g._-Os . However, given the earlier messages in this thread, please first check that your package actually benefits from it.
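
As a rough sketch of what that looks like in practice, assuming a package whose debian/rules uses the plain dh sequencer and whose build system honours the flags from dpkg-buildflags (both assumptions, not something every package satisfies):

```make
#!/usr/bin/make -f
# Sketch only: append -O3 to the C and C++ flags that dpkg-buildflags emits.
# The default -O2 stays in the flag list; gcc uses the last -O level given.
# Use -ftree-vectorize instead of -O3 if only the vectorizer is wanted.
export DEB_CFLAGS_MAINT_APPEND = -O3
export DEB_CXXFLAGS_MAINT_APPEND = -O3

%:
	dh $@
```

To see what the compiler would receive, something like "DEB_CFLAGS_MAINT_APPEND=-O3 dpkg-buildflags --get CFLAGS" should show -O3 at the end of the list.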

There is or was also a "hwcaps" mechanism for having multiple versions of a binary for different CPUs, but I've never tried to use it. For pocl (ITP #676504) the speed difference between -march=corei7-avx and plain amd64 is about 20%; I haven't measured it on i386, and other packages may be very different.

