Re: perspectives on 32 bit vs 64 bit
Adam Skutt wrote:
Helge Hafting wrote:
You can address more than 4GiB by using the always-unpopular
"segment" registers found on Intel processors.
How? In protected mode, they're in use as segment descriptor
selectors. Certain bits have specific meanings you cannot override,
as they're part of the memory protection mechanism.
Yes, so?
Simply have all but one segment "not present" and rely on the OS
to trap the access and remap the page tables whenever the code
switches segments.
Remap the tables to what? The address used for the lookup with a PTE
is 32-bit.
Sigh. Every mechanism that lets the OS support more than 4GB across
several processes can also be used to support more than 4GB for a
single process. That is trivial, although less efficient than
supporting only 4GB.
History:
8086: 16-bit addressing, limited to 64kB. But "segments" allowed
addressing of up to 1MB.
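
A minimal sketch of how the 8086 gets from 16-bit values to 1MB, assuming the usual real-mode rule (segment shifted left by 4 bits, plus offset). The name physical_address is made up for this example.

```python
# Real-mode 8086 address arithmetic: physical = segment * 16 + offset.
# Function and constant names are illustrative, not from any library.

ADDRESS_LINES = 20  # the 8086 has 20 physical address lines


def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and wrap at 1MB the way a 20-bit address bus does.
    return ((segment << 4) + offset) & ((1 << ADDRESS_LINES) - 1)


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xF000, 0xFFFF)))  # 0xfffff, top of the 1MB range
```

Each 16-bit offset still only reaches 64kB; the extra range comes entirely from changing the segment value.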
80286 protected mode: Still 16-bit addressing. The segment registers
are turned into "segment descriptor selectors". Now you can
address up to 16MB.
80386 32-bit mode without extensions: 32-bit addressing, limited to 4GB.
There is also a set of page tables, so that virtual and physical
addresses may differ. Physical addresses are still limited
to 32 bits.
80386 with PAE extensions (in practice Pentium Pro and later): Still
32-bit addressing, but the page tables can map the 32-bit virtual
addresses into a bigger physical address space.
The _simple_ use of this is to support more than 4GB of RAM in total,
but only 4GB per process.
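
A toy sketch of the "simple" PAE use: every process keeps 32-bit virtual addresses, but its page table may point at frames above 4GB. One flat dict stands in for the real multi-level tables, the 36-bit physical width is that of the original PAE, and all names here are hypothetical.

```python
PAGE_SHIFT = 12                      # 4kB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1
PHYS_BITS = 36                       # original PAE: 36-bit physical addresses


def translate(page_table: dict, vaddr: int) -> int:
    """Map a 32-bit virtual address to a physical address.

    page_table maps virtual page number -> physical frame number.
    A missing entry raises KeyError, standing in for a page fault.
    """
    if not 0 <= vaddr < (1 << 32):
        raise ValueError("virtual addresses are 32-bit")
    frame = page_table[vaddr >> PAGE_SHIFT]
    paddr = (frame << PAGE_SHIFT) | (vaddr & PAGE_MASK)
    if paddr >= (1 << PHYS_BITS):
        raise ValueError("frame number exceeds the physical address width")
    return paddr


# Two processes use the same virtual address but different frames.
proc_a = {0x00400: 0x012345}   # frame below 4GB
proc_b = {0x00400: 0x123456}   # frame above 4GB, needs more than 32 bits

print(hex(translate(proc_a, 0x00400ABC)))  # 0x12345abc
print(hex(translate(proc_b, 0x00400ABC)))  # 0x123456abc
```

Neither process can see more than 4GB at once, because the virtual address fed into the lookup is still 32-bit.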
If you want more than 4GB for a _single_ process, then you need
to change the page table mappings on demand while the process runs.
This can be done in two ways:
1. The process explicitly calls into the memory management system
to do this. That means accessing more than 4GB isn't transparent
to the programmer; you have to code it explicitly.
2. Use the segment descriptors. Each segment still can't map more
than 4GB, but the OS can mark most segments as "not present".
Whenever the app reloads a segment register (i.e. uses a 48-bit
pointer whose segment selector differs from that of the last pointer
used), the OS gets a trap similar to a page fault. The OS can then
look up which segment descriptor the app loaded, change the page
tables accordingly, mark that segment present (and the previous one
not present), and let the code continue. Performance may be
"reasonable" for code that stays inside the same 4GB most of the
time. Code that moves "all over the place" will take a very big
performance hit, as every memory operation "page faults" and incurs
the same overhead as a context switch. Still, this mode of operation
is transparent to the programmer, i.e. you can recompile ordinary
portable code (assuming you have a compiler that supports this
memory model) and have it work without change. (This assumes the
code makes no assumptions about the size or layout of a pointer:
pointers are 48-bit and cannot be incremented with simple arithmetic
alone.) This is why I said "you don't want to do this". Doable, but
hard and not efficient.
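
A rough simulation of way 2, to show where the cost comes from. It is only a sketch: a Python exception stands in for the not-present trap, a dict stands in for each segment's 4GB region, the present check runs on every access rather than only on a segment register reload, and all names (ToyMachine, SegmentNotPresent, and so on) are invented for the example.

```python
class SegmentNotPresent(Exception):
    """Stands in for the CPU's segment-not-present trap."""


class ToyMachine:
    def __init__(self, n_segments):
        # Each segment gets its own backing store; a dict stands in for
        # the up-to-4GB region that the page tables would map.
        self.backing = [dict() for _ in range(n_segments)]
        self.present = None   # the one segment currently marked present
        self.faults = 0       # number of remaps the "OS" had to perform

    def _load_segment(self, selector):
        # The CPU checks the present bit when a selector is loaded.
        if selector != self.present:
            raise SegmentNotPresent(selector)

    def _os_fault_handler(self, selector):
        # The OS switches the page tables to the new segment's region
        # and marks it present (implicitly un-presenting the old one).
        self.faults += 1
        self.present = selector

    def _access(self, selector):
        try:
            self._load_segment(selector)
        except SegmentNotPresent:
            self._os_fault_handler(selector)
            self._load_segment(selector)  # retry the faulting load
        return self.backing[selector]

    def write(self, far_ptr, value):
        selector, offset = far_ptr        # a "48-bit pointer" as a pair
        self._access(selector)[offset] = value

    def read(self, far_ptr):
        selector, offset = far_ptr
        return self._access(selector)[offset]


m = ToyMachine(n_segments=4)

for off in range(1000):         # stays inside one segment
    m.write((0, off), off)
local_faults = m.faults

for i in range(1000):           # alternates between two other segments
    m.write((i % 2 + 1, i), i)
scattered_faults = m.faults - local_faults

print(local_faults, scattered_faults)  # 1 1000
```

The code using far pointers never asks for a remap itself, which is the "transparent" part; the fault counter shows why code that hops between segments pays for it on every access.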
AMD64 mode: real 64-bit addressing, which is much easier to work with.
It is better than "80386 PAE" in the same way as "80386 32-bit"
was better than "80286 protected mode". Addressing more than
4GB is now trivial. No tricks at all, as pointers are 64-bit.
Old code that doesn't make assumptions about pointer size can be
recompiled without change.
Helge Hafting