Re: 2.6.0-test6

To: "David S. Miller" <davem@redhat.com>
Cc: <debian-sparc@lists.debian.org>
Subject: Re: 2.6.0-test6
From: "Paul" <paul@techcenter3000.com>
Date: Fri, 3 Oct 2003 08:02:43 -0500
Message-id: <050301c389ae$a28e1080$0702a8c0@tfam.org>
References: <001501c388e6$ce583a10$0702a8c0@tfam.org> <20031002130030.GR19687@phunnypharm.org> <008e01c38928$47da4590$0702a8c0@tfam.org> <20031002204425.GA474@phunnypharm.org> <00ed01c3892f$b73b9770$0702a8c0@tfam.org> <20031002224839.GB474@phunnypharm.org> <015301c38940$f8da6380$0702a8c0@tfam.org> <20031003012354.0ecc8364.davem@redhat.com> <045501c389a0$83e8a360$0702a8c0@tfam.org> <20031003050649.7c75e298.davem@redhat.com>

Ok, I'll happily concede that 64 bit Sparcs are faster (in general) than 32
bit Sparcs. BUT, I think that has more to do with better/more
[cache|architecture|MHz|optimizations|processor design] than with simple
'64-bittedness'. To put it bluntly, of course a 1GHz 64 bit processor is
faster than a 500MHz 32 bit processor, but clock for clock is the issue
(IMHO).

I assume that Sparcs don't have braindead Itanium-ness issues with 32 bit
code. I could be wrong, but it doesn't seem like it. Empirically, the fact
that userland is 32 bit supports this.

Now, the kernel may be the only thing that executes 64 bit code.
If so, there are two issues that would harm performance:
1) The kernel talks to the hardware. So, let's take a 16 gig IDE drive in
PIO mode (simply because it IS a shoddy interface). IDE can do at best a 16
bit transfer per cycle; only 16 bits are available on the physical
interface. Now, the kernel wants to use 64 bits. So, either we 'pad' 48
bits, or we do 4 transfers and combine them into one 64 bit 'word'. This
would happen at LEAST twice per operation, once for 'seek LBA' and
once for 'read sector'. Apply this to the (most) common 32 bit PCI bus. Apply
this to, even worse, printer I/O. (Ok, so we do offload a lot of this onto
sub-processors, but the point remains the same.)
How many applications do you run that do not EVER access ANY hardware? I
can't think of any except a no-op loop. Even a number crunching app has to
talk to memory, and the kernel controls that.
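
For illustration, the "4 transfers combined into one 64 bit word" case can be sketched in a few lines of Python. This is only a sketch under assumptions: read_word16 and fake_port are hypothetical stand-ins for a 16 bit data-port read, the first word read is assumed to land in the lowest bits, and none of this is taken from a real IDE driver.

```python
from typing import Callable, List


def assemble_word64(read_word16: Callable[[], int]) -> int:
    """Build one 64-bit value from four 16-bit reads (first read = lowest bits)."""
    value = 0
    for i in range(4):
        half = read_word16() & 0xFFFF   # keep only the 16 bits the bus delivers
        value |= half << (16 * i)       # place them in the next 16-bit slot
    return value


def fake_port(words: List[int]) -> Callable[[], int]:
    """Hypothetical stand-in for a 16-bit data port: hands out queued words in order."""
    queue = iter(words)
    return lambda: next(queue)


if __name__ == "__main__":
    port = fake_port([0x1111, 0x2222, 0x3333, 0x4444])
    print(hex(assemble_word64(port)))
```

The cost in question is the loop: four reads plus a shift and an OR each, versus one read if the bus were as wide as the register. Whether a real driver ever needs to assemble 64 bit values this way, rather than copying 16 bit words straight into a buffer, is exactly what would have to be checked.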

> But the only thing which executes 64-bit code is the operating system
> kernel, not the actual applications.
What would you say if I told you ONLY the kernel was buggy? Wouldn't that
affect userland? :)

I think our issue here may come from the fact that UltraSparc I and II at
least are capable of running perfectly happily with 32bit kernels. (I have
these procs, among others.) So why wouldn't I keep the raw speed of a fast
64bit processor that happens to be capable of running either 32 _or_ 64 bit
kernels AND wipe out the overhead I believe is incurred by the 64 bit kernel
by running a 32 bit kernel?

I'm honestly not sure if the 64 bit kernel requirement for UltraSparc III's
mentioned below is a hardware driven issue or a decision by Sun to disallow
32 bit kernels on III's.
(Please, can anyone shed light on this?)

2) Userland is mostly 32 bit, if not completely. So what happens when I make
a 32 bit call to the kernel that wants to receive 64 bits, and return 64
bits? The kernel has to either assume 32 bitness and pad in at least one
direction, or decide whether it is 32 or 64 and pad, which would add even
more overhead.
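
The "pad" step for a 32 bit call into a 64 bit kernel amounts to widening each argument on the way in and narrowing the result on the way out. A rough Python sketch follows; the function names are made up for illustration and are not the kernel's compat-layer API, and it assumes unsigned values and pointers get zero-extended while signed integers get sign-extended.

```python
MASK32 = 0xFFFFFFFF


def zero_extend32(arg: int) -> int:
    """Widen a 32-bit value to 64 bits, filling the upper half with zeros."""
    return arg & MASK32


def sign_extend32(arg: int) -> int:
    """Widen a 32-bit value to 64 bits, copying bit 31 into the upper half."""
    arg &= MASK32
    if arg & 0x80000000:            # negative when read as a signed 32-bit int
        arg |= 0xFFFFFFFF00000000   # replicate the sign bit upward
    return arg


def truncate_to32(result: int) -> int:
    """Narrow a 64-bit return value back to the 32 bits the caller expects."""
    return result & MASK32


if __name__ == "__main__":
    minus_one = 0xFFFFFFFF          # -1 as a 32-bit pattern
    print(hex(zero_extend32(minus_one)))
    print(hex(sign_extend32(minus_one)))
    print(hex(truncate_to32(0x100000005)))
```

Each widening is a mask or a sign-extend per argument, so the per-call cost is small but not zero; how much it matters depends on how syscall-heavy the workload is, which is the benchmarking question.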

Let me throw this in from:
http://www.sunmanagers.org/pipermail/summaries/2002-October/003914.html
(Thanks Google!)

1) Probably the best answer, benchmark in the appropriate environment with
the appropriate applications. Your mileage may vary.
2) Majority reported that running a 32-bit application on the 64-bit kernel
requires more cycles to do alignment and byte packing operations. Thus it is
reasonable that 32-bit code would run slower on the 64-bit kernel than the
32-bit kernel.
3) Following #2's lead, the cache is half-sized in terms of number of words
on the 64-bit kernel since the word size is 64 bits instead of 32 bits. This
can cause more cache misses and degrade performance.
4) One reported floating point operations double in performance when the
application is compiled for 64-bit and (obviously) running on the 64-bit
kernel versus the app's 32/32 counterpart.
5) The 64-bit kernel allows the address space to grow beyond 4GB and this
can enhance performance by allowing more operations in the VM space rather
than forcing the operation to disk. Oracle hash join was given as an example.

I hadn't thought about the half sized cache issue (#3), but heck, doubling
your cache rarely HURTS performance.
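
The arithmetic behind points 3 and 5 of that summary is simple enough to write down. A minimal Python sketch, where the 16 KB cache size is an assumption picked only for round numbers and the helper names are hypothetical:

```python
def words_per_cache(cache_bytes: int, word_bits: int) -> int:
    """How many words of the given width fit in a cache of cache_bytes."""
    return cache_bytes // (word_bits // 8)


def address_space_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** pointer_bits


if __name__ == "__main__":
    cache = 16 * 1024  # assumed 16 KB cache, not a measured figure
    print(words_per_cache(cache, 32), words_per_cache(cache, 64))
    print(address_space_bytes(32) // 2**30, "GiB")
```

With those numbers the same cache holds 4096 32 bit words but only 2048 64 bit words, and a 32 bit pointer tops out at 4 GiB. The halving only bites for data that is actually stored as 64 bit words (pointers and longs in 64 bit code); 32 bit data takes the same space either way.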

I'm not saying that 32 bit is right for everyone; I am saying that for what
I do, it would give me a noticeable and measurable boost in speed.

On the flip side of this, I have to say, if you're doing heavy graphics
work, almost without a doubt 64 bit is for you. (Gee, the bus width of UPA
for example is..taadaa-> 64 bits..theoretically one transfer)

Again, if I'm wrong, please explain to me how and/or why. I'm learning every
day, and I'll learn from anybody that can/will teach me.

Thanks
Paul

P.S. This COULD all be a plot on my part to make sure my poor SS20 will
still be supported in 2.6! GRIN!

----- Original Message ----- 
From: "David S. Miller" <davem@redhat.com>
To: "Paul" <paul@techcenter3000.com>
Cc: <debian-sparc@lists.debian.org>
Sent: Friday, October 03, 2003 7:06 AM
Subject: Re: 2.6.0-test6

> On Fri, 3 Oct 2003 06:21:38 -0500
> "Paul" <paul@techcenter3000.com> wrote:
>
> > Let's see, the first and most obvious issue would be transferring 64 bit
> > pointers per cycle instead of two 32 bit pointers in the same cycle. Sure,
> > if I had a large enough data set, it would speed things up greatly.
>
> But the only thing which executes 64-bit code is the operating system
> kernel, not the actual applications.
>
> The applications are fully 32-bit.  The 64-bit operating system kernel
> supports running both 32-bit and 64-bit applications.
>
> The 64-bit kernel doesn't have anything to do with processor speed.
>
> But it is a fact that every 64-bit _PROCESSOR_ that Sun ever sold
> is faster than basically all of the 32-bit processors they ever sold.
>
> So whatever perceived or real difference in performance you think you'll
> take due to the fact the sparc64 systems use a 64-bit kernel is entirely
> nullified by how much faster the sparc64 chips are both in terms of MHZ,
> instruction parallelism, and memory bandwidth.
