Hi,

On 13.6.2025 4.36, Finn Thain wrote:
> And therein lies the rub -- to identify those workloads which should be measured and to afford each one a suitable weight in your decision making.
It's not just the workload that affects the results; compiler version, optimization options [1], workload & kernel config options and sometimes even unrelated code changes [2] can affect how a given instruction sequence settles into the cache.
> That's why this was always political. I'd rather keep things technical and fact-based.

Whatever testing is done, the one wider conclusion that *can* be drawn from it is that if there's a noticeable performance difference, such differences are also possible in other workloads.
(A very large difference could also indicate functional issues, e.g. a code generation bug in a given compiler build. That's why it's important to have good tooling for pinpointing what exactly is causing the difference.)
- Eero

[1] One example is -Os vs. -O2 having a 2x perf impact on Geert's experimental Atari drm fb code. That would completely hide any impact from alignment.
[2] With more complex cache hierarchies than on m68k, adding or removing code elsewhere can change the cache line alignment of other parts of the resulting binary. Not a concern for m68k though.
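
To make the alignment effect in [2] concrete, here is a rough sketch (not code from this thread; the function name, addresses and 64-byte line size are made up for illustration). It counts how many cache lines a piece of code spans, so that shifting the same code by a few bytes can be seen to change the count:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of cache lines covered by `size` bytes
 * of code starting at address `addr`.
 * Assumes `line_size` is a power of two (64 is just an example value).
 */
unsigned cache_lines_touched(uintptr_t addr, unsigned size, unsigned line_size)
{
    assert(size > 0);
    assert(line_size > 0 && (line_size & (line_size - 1)) == 0);

    uintptr_t first = addr / line_size;              /* line holding the first byte */
    uintptr_t last = (addr + size - 1) / line_size;  /* line holding the last byte */

    return (unsigned)(last - first + 1);
}
```

With these made-up numbers, a 48-byte sequence starting at 0x1000 fits in one 64-byte line, whereas the same sequence pushed to 0x1018 by 24 bytes of unrelated code placed before it straddles two lines.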