I think you misunderstand the compiler option, which is fine, because it applies to Solaris. Because accessing unaligned memory raises a hardware trap, which forces a context switch into the kernel, you can reduce the risk of this by having the compiler assume that any k-aligned object is actually only j-aligned, j < k, and perform k / j loads of size j instead. This is slower than a single k-sized load, but it's better than taking the trap. You correctly understood this part.
HOWEVER, if you STILL get an unaligned load trap, the kernel can transparently handle it for you if you opt in. Most likely, if you're developing new code, you will opt out, because it's better to find performance issues like this while still developing. But if you inherit x86 code that does terrible things, then it may save developer time to just say "whatever, just emulate it". In that case you will still generate a hardware trap, but the process will not get a SIGBUS. The difference is subtle but important. Again, this is a feature of Solaris/SPARC, selected with -xmemalign=<n>i (for example, -xmemalign=8i).
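To make the k / j idea concrete, here is a minimal sketch in C of what the split amounts to for k = 4, j = 1: four byte loads combined into one 32-bit value. The helper names loads_needed and load_be32_bytewise are made up for illustration, and the big-endian byte order is my assumption (it matches SPARC). This is not what the compiler literally emits, just the same idea written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many j-sized loads replace one k-sized load. */
static size_t loads_needed(size_t k, size_t j)
{
    assert(j != 0 && j <= k && k % j == 0);
    return k / j;
}

/* Hypothetical helper: read a 32-bit big-endian value from an address that
 * is only assumed to be 1-byte aligned (k = 4, j = 1). Each p[i] is a byte
 * load, and a byte load can never be misaligned. */
static uint32_t load_be32_bytewise(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

For instance, with the bytes 12 34 56 78 stored starting at an odd address, load_be32_bytewise returns 0x12345678 without ever issuing a 4-byte load, at the cost of four loads plus the shifts and ORs.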
The point of my previous post was to show that the default Linux behavior (for me) was to simply emulate the load, in kernel mode, with multiple smaller loads. I wouldn't be surprised if there were a way to enable unaligned load fixups automatically in Linux, probably a kernel option or runtime configuration. Short version: not getting SIGBUS is NOT proof that unaligned loads are not happening.
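Since the absence of a SIGBUS tells you nothing, one way to look for unaligned accesses yourself is to check the address before the load. A minimal sketch in C, where is_aligned is a hypothetical helper, and where I'm assuming the usual flat pointer-to-integer mapping (true on mainstream platforms, but implementation-defined as far as the C standard is concerned):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if p is a multiple of k bytes.
 * Assumes converting a pointer to uintptr_t yields its numeric address. */
static int is_aligned(const void *p, size_t k)
{
    assert(k != 0);
    return ((uintptr_t)p % k) == 0;
}
```

Dropping something like assert(is_aligned(ptr, sizeof(uint32_t))) in front of a suspicious cast in a debug build will flag the problem even on a system where the kernel silently fixes it up.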
Patrick