Re: Intel Core2Duo (T7400)
On Mon, Nov 12, 2007 at 11:51:49PM +0100, Gabor Gombas wrote:
> AFAIK SSE is also available in 32-bit mode so that is no reason why
> x86_64 should be faster.
Available yes, but not the default. You would have to recompile all
libraries and applications that use floating point and tell them all to
use SSE math instead of x87 math. With 64bit, SSE math is the default
and x87 is essentially never used. So yes, it is possible, but it is not
trivial and has compatibility issues. Of course you also lose 80bit
floats when you go to SSE math and end up with only the standard 64bit
doubles (I don't think anything other than x87 ever had 80bit floats).
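
To make the 80bit versus 64bit point concrete, here is a rough Python
sketch. It does not touch the FPU at all; it only models rounding an
integer to a 53-bit significand (a standard double) or a 64-bit
significand (the x87 extended format). The helper name
round_to_mantissa is made up for this illustration.

```python
import sys

DOUBLE_BITS = 53    # significand width of an IEEE 754 double (what SSE2 math gives you)
EXTENDED_BITS = 64  # significand width of the x87 80bit extended format


def round_to_mantissa(value, bits):
    """Round a positive integer to `bits` significant binary digits.

    Hypothetical helper for illustration only; uses round-half-to-even,
    the default IEEE 754 rounding mode.
    """
    excess = value.bit_length() - bits
    if excess <= 0:
        return value  # already fits, nothing is lost
    quotient, remainder = divmod(value, 1 << excess)
    half = 1 << (excess - 1)
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
    return quotient << excess


n = 2**53 + 1  # needs 54 significant bits
as_double = round_to_mantissa(n, DOUBLE_BITS)      # the low bit is rounded away
as_extended = round_to_mantissa(n, EXTENDED_BITS)  # still exact

print(as_double == n, as_extended == n)
```

Python's own float is a 64bit double, so int(float(n)) should agree
with as_double here. Code that quietly relied on the extra 11 bits of
x87 intermediates can give slightly different results once it is
rebuilt for SSE math.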
> Again this is a reason why new 32-bit processors are faster than old
> 32-bit processors, but not a reason why x86_64 mode should be faster
> than i386 mode on the same processor.
I think the new instructions may be a big part. After all, things that
were added along the way on 32bit x86 chips couldn't be used by default
in compilers, since not all 32bit x86 cpus would have the instruction.
By being a new demarcation point, the 64bit cpus can say all those new
instructions are mandatory and hence the compiler can always use them.
There has been some useful stuff added over the years, and while you
always had the option of compiling code for only a certain level of CPU,
it was not the default. After all, there have been linux distributions
compiled so that they only run on 686 and higher CPUs at the loss of
backwards compatibility, and they did claim it gave performance
> I don't have numbers so I can't really argue, but that is the largest
> visible difference between the two modes.
I think the set of instructions that are required by the architecture is
much more important. It would be interesting to compare what the
performance is in 32bit mode between gcc compiling for 686 with sse and
all that and gcc compiling for 64bit limited to the same register count.
I have also seen claims that some operations are way faster in 64bit
mode because it has new instructions that do them in much less time than
you could with the 32bit instruction set (I think 64bit integer
multiplication was one of the ones I read about, where in 32bit mode it
takes twice as long as in 64bit mode, since in 64bit mode it is a native
operation while 32bit mode has to combine several instructions to
perform the same thing, or something like that). So the new
instructions can certainly have helped a lot.
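
The 64bit multiply case can be sketched as follows. This is only an
arithmetic model in Python of how a 64x64 multiply (keeping the low 64
bits) gets built from 32bit pieces; the function names are made up, and
the real instruction count depends on the compiler and CPU.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul32_wide(a, b):
    """Model of a 32x32 -> 64 bit unsigned multiply (hypothetical helper)."""
    assert 0 <= a <= MASK32 and 0 <= b <= MASK32
    return a * b


def mul64_from_32bit_parts(x, y):
    """Low 64 bits of x * y using only 32bit-sized operands."""
    x_lo, x_hi = x & MASK32, x >> 32
    y_lo, y_hi = y & MASK32, y >> 32
    # One widening multiply for the low halves.
    low = mul32_wide(x_lo, y_lo)
    # Two more multiplies for the cross terms; only their low 32 bits
    # can land inside the 64 bit result. x_hi * y_hi is shifted out entirely.
    cross = (mul32_wide(x_lo, y_hi) + mul32_wide(x_hi, y_lo)) & MASK32
    return (low + (cross << 32)) & MASK64


def mul64_native(x, y):
    """What a single native 64bit multiply would return."""
    return (x * y) & MASK64


x, y = 0x123456789ABCDEF0, 0x0FEDCBA987654321
print(mul64_from_32bit_parts(x, y) == mul64_native(x, y))
```

So in this model the 32bit path needs three multiplies plus the adds
and shifts, where 64bit mode needs one multiply.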
> You said Sparc/Solaris; I don't know the current top-of-line configs but
> several hundred gigabytes of memory should not present a problem for a
> really high-end Sun server and as you said most of the userspace is
> still 32-bit...
Most of user space, yes. They run a 64bit capable kernel and a mainly
32bit userspace (to avoid the performance hit of the pointer size
increase and all that, and since for the most part sparc doesn't gain
any speed going to 64bit mode; it is just a change in memory model). The
few programs that need a larger memory space run in 64bit mode.
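
As a rough illustration of the pointer size hit, here is a Python
sketch using the struct module to size a made-up doubly linked list
node (one 32bit integer plus two pointers). The layouts are an
assumption: 4 byte pointers for the 32bit case, and 8 byte pointers
with 8 byte alignment for the 64bit case, as is typical for common
ABIs.

```python
import struct

# Hypothetical node: int32 payload, prev pointer, next pointer.
# "<" selects standard sizes with no implicit padding, so the padding a
# 64bit ABI would typically insert is written out explicitly as "4x".
NODE_32BIT = "<iII"     # 4 byte payload + two 4 byte pointers
NODE_64BIT = "<i4xQQ"   # 4 byte payload + 4 pad bytes + two 8 byte pointers

size_32 = struct.calcsize(NODE_32BIT)
size_64 = struct.calcsize(NODE_64BIT)

nodes = 1_000_000
print(size_32, size_64)
print(nodes * size_32, nodes * size_64)  # total bytes for a million nodes
```

Under those assumptions the same list takes twice the memory, and so
twice the cache footprint, in 64bit mode, without the program doing any
more useful work.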
> The kernel of course must be 64-bit, but that's not a problem even if
> 64-bit mode is significantly slower since applications do not spend too
> much time in the kernel (and if they do that's almost certainly a bug).
Well I run a 64bit kernel with 32bit user space for x86, since I mainly
use 32bit stuff (that's what I develop for in my job, since we use
embedded x86 chips), but it means I can still run 64bit programs when I
want to try something, and I have a chroot with all 64bit stuff in it to
> But back to the original issue: x86_64 is _NOT_ faster because it is
> using 64-bit addressing - quite the contrary, that alone would have made
> it slower than 32-bit mode. But AMD also did a lot of other
> modifications that they _could_ have also enabled in 32-bit mode but
> they simply chose not to, because otherwise they could not have sold
> their 64-bit processors.
Certainly you could have added any new instructions to 32bit mode, and
you could have added the extra registers, and you could have declared
SSE the default floating point and eliminated MMX and all that. But you
would essentially have had to declare a new operating mode so the OS and
applications would know it was not the same as previous 32bit modes.
(The P3 did that when it added SSE, as far as I recall, and I believe
the Pentium managed to avoid adding a new mode for MMX by reusing the
floating point registers, so that the OS didn't need to know about any
new registers, but the application had to pick either MMX or floating
point and couldn't do both easily.) Given the state of the computing
world, adding a new mode without adding 64bit address space at the same
time would have made no sense, and coming from AMD it probably wouldn't
have been seen as important enough to bother supporting if they hadn't
gone all the way.