Re: Only 3.6gb of 64gb RAM recognized by 64bit squeeze
On 5/30/2012 8:55 AM, Seyyed Mohtadin Hashemi wrote:
> On Tue, 2012-05-29 at 15:48 -0500, Stan Hoeppner wrote:
>> On 5/29/2012 4:08 PM, Seyyed Mohtadin Hashemi wrote:
>>> On Tue, 2012-05-15 at 21:26 +0200, Seyyed Mohtadin Hashemi wrote:
>>>> On Tue, May 15, 2012 at 8:51 PM, Stan Hoeppner
>>>> <stan@hardwarefreak.com> wrote:
>>>> On 5/15/2012 12:26 PM, Seyyed Mohtadin Hashemi wrote:
>>>> > On Tue, May 15, 2012 at 4:30 AM, Henrique de Moraes Holschuh <hmh@debian.org> wrote:
>>>> >
>>>> >> On Mon, 14 May 2012, Stan Hoeppner wrote:
>>>> >>> On 5/13/2012 7:02 PM, Henrique de Moraes Holschuh wrote:
>>>> >>>> On Fri, 11 May 2012, Seyyed Mohtadin Hashemi wrote:
>>>> >>>>> On 5/10/2012 1:16 PM, Stan Hoeppner wrote:
>>>> >>>>>> If this doesn't fix the issue, and memtest and other utils can see all
>>>> >>>>>> 64GB just fine, then I'd say you're dealing with a BIOS bug.
>>>> >>>>>
>>>> >>>>> The very top of /var/log/dmesg has the kernel debug output about the memory
>>>> >>>>> map. It might well tell us very quickly who is the culprit, if the user
>>>> >>>>> with the problem can post it for the best working case and the non-working
>>>> >>>>> [ 0.000000] e820 update range: 00000000e0000000 - 000000101f000000 (usable) ==> (reserved)
>>>> >>>>> [ 0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 61936MB of RAM.
>>>> >>>>
>>>> >>>> There you have it.
>>>> >>>
>>>> >>> I'm not surprised I was correct WRT a BIOS bug, but I am a little
>>>> >>> embarrassed I didn't know and suggest this would be reported in dmesg.
>>>> >>> I admit I just don't see this very often--this being the 1st time
>>>> >>> actually seeing this WARNING.
>>>> >>
>>>> >> Well, it is the first time I've seen a BIOS screw it up so badly as to
>>>> >> have someone lose 61GiB of RAM over it.
>>>> >>
>>>> >>>> Any of the latest versions of the longterm kernels (2.6.32, 3.0), or
>>>> >>>> latest 3.2 should be able to repair MTRRs properly, but you have to
>>>> >>>> compile the kernel with that option enabled. It might be already
>>>> >>>> available, but not enabled by default. In that case, this might help
>>>> >>>> you:
>>>> >>>
>>>> >>> Yep. In vanilla 3.2.6 it's selected by default in menuconfig, and you
>>>> >>> can't un-select it.
>>>> >>
>>>> >> We _really_ need to have that enabled by default on the Debian kernels
>>>> >> IMO, if we don't enable it already.
>>>> >>
>>>> >> --
>>>> >> "One disk to rule them all, One disk to find them. One disk to bring
>>>> >> them all and in the darkness grind them. In the Land of Redmond
>>>> >> where the shadows lie." -- The Silicon Valley Tarot
>>>> >> Henrique Holschuh
>>>> >>
>>>> >
>>>> > Thank you for the tips, Henrique and Stan. Unfortunately I don't have time
>>>> > to build/test new kernels this week because I have to finish my thesis. I
>>>> > will have time next week to look at it and report back the results.
>>>>
>>>>
>>>> In that case you could simply install the latest backport kernel image
>>>> and see if that does the trick. Should be quick 'n painless.
>>>>
>>>> Add to /etc/apt/sources.list
>>>> deb http://backports.debian.org/debian-backports squeeze-backports main contrib non-free
>>>>
>>>> $ aptitude update
>>>> $ aptitude -t squeeze-backports install linux-image-3.2.0-0.bpo.1-amd64
>>>> $ shutdown -r now
>>>>
>>>> Should take less than 5 minutes.
>>>>
>>>> --
>>>> Stan
>>>>
>>>>
>>>> Funny you should mention that: I did actually try that exact kernel
>>>> yesterday. It did not go well; I got a kernel panic. I didn't do many
>>>> tests because I didn't have much time, so I went back to the old
>>>> kernel. Though I'm not happy with the situation, the computer at
>>>> least works and I can use the CPU to do calculations.
>>>
>>>
>>> Hi Stan,
>>>
>>> I RMA'd the MB and with the replacement I received I am able to run the
>>> 3.2 kernel and all installed RAM is usable. However, I have to use
>>> "noapic irqpoll acpi=force" boot flags.
>>
>> Needing some boot flags with some main boards isn't uncommon. And in
>> fact using various boot flags used to be (maybe still is) needed to get
>> Linux VMs running properly on VMware ESX, specifically the system clock.
>> So the boot flags are just a bare metal hardware issue.
>>
>>> I did have a small problem: sometimes I would get "RAM R/W test fail" at
>>> BIOS POST. I had run extensive memtest passes on the DIMMs earlier, so I
>>> only tested whether each individual DIMM could POST; only one gave the
>>> "RAM R/W test fail". After removing the faulty DIMM plus a healthy DIMM,
>>> the system works smoothly.
>>
>> What replacement board did you get? Another ASUS or a SuperMicro?
>>
>
> I got another ASUS (same model). The only SuperMicro boards I could get at
> the vendor were the H8DGU-F or quad CPU MBs, none of which I wanted.
So, are you certain the original ASUS board was defective? You may want
to update the subject with SOLVED: and describe the fix. Glad it's
working now. :)
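
If it helps with the write-up, here is a rough sketch for double-checking the kernel's numbers from the dmesg excerpt quoted above. The function names (mtrr_lost_mb, range_mib) are made up for illustration, and it assumes the warning text looks exactly like the line quoted in this thread; other kernel versions may word it differently.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, message format is the one
# quoted in this thread.

# Read kernel log text on stdin and print the MB figure from the first
# "losing NNNMB of RAM" line; prints nothing if no such line is present.
mtrr_lost_mb() {
    sed -n 's/.*losing \([0-9][0-9]*\)MB of RAM.*/\1/p' | head -n 1
}

# Print the size in MiB of an address range, given its start and end as
# hex strings without the 0x prefix (as they appear in the e820 line).
range_mib() {
    echo $(( (0x$2 - 0x$1) / 1048576 ))
}

# Typical use on the affected machine (left commented out on purpose):
#   dmesg | mtrr_lost_mb
#   range_mib 00000000e0000000 000000101f000000
#   grep MemTotal /proc/meminfo
```

Fed the two lines from this thread, mtrr_lost_mb prints 61936 and range_mib prints 62448. The 512 MB difference is the same size as the range from 0xe0000000 up to 4 GiB, which I would guess is not RAM, but that is an assumption on my part.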
--
Stan