
Re: Only 3.6gb of 64gb RAM recognized by 64bit squeeze



On 5/15/2012 2:26 PM, Seyyed Mohtadin Hashemi wrote:
> On Tue, May 15, 2012 at 8:51 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
>> On 5/15/2012 12:26 PM, Seyyed Mohtadin Hashemi wrote:
>>> On Tue, May 15, 2012 at 4:30 AM, Henrique de Moraes Holschuh <hmh@debian.org> wrote:
>>>
>>>> On Mon, 14 May 2012, Stan Hoeppner wrote:
>>>>> On 5/13/2012 7:02 PM, Henrique de Moraes Holschuh wrote:
>>>>>> On Fri, 11 May 2012, Seyyed Mohtadin Hashemi wrote:
>>>>>>> On 5/10/2012 1:16 PM, Stan Hoeppner wrote:
>>>>>>>> If this doesn't fix the issue, and memtest and other utils can see
>>>>>>>> all 64GB just fine, then I'd say you're dealing with a BIOS bug.
>>>>>>>
>>>>>>> The very top of /var/log/dmesg has the kernel debug output about the
>>>>>>> memory map.  It might well tell us very quickly who is the culprit, if
>>>>>>> the user with the problem can post it for the best working case and the
>>>>>>> non-working
>>>>>>> [    0.000000] e820 update range: 00000000e0000000 - 000000101f000000 (usable) ==> (reserved)
>>>>>>> [    0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 61936MB of RAM.
>>>>>>
>>>>>> There you have it.
>>>>>
>>>>> I'm not surprised I was correct WRT a BIOS bug, but I am a little
>>>>> embarrassed I didn't know and suggest this would be reported in dmesg.
>>>>> I admit I just don't see this very often--this being the 1st time
>>>>> actually seeing this WARNING.
>>>>
>>>> Well, it is the first time I've seen a BIOS screw it up so badly as to
>>>> have someone lose 61GiB of RAM over it.
>>>>
>>>>>> Any of the latest versions of the longterm kernels (2.6.32, 3.0), or
>>>>>> latest 3.2 should be able to repair MTRRs properly, but you have to
>>>>>> compile the kernel with that option enabled.  It might be already
>>>>>> available, but not enabled by default.  In that case, this might help
>>>>>> you:
>>>>>
>>>>> Yep.  In vanilla 3.2.6 it's selected by default in menuconfig, and you
>>>>> can't un-select it.
>>>>
>>>> We _really_ need to have that enabled by default on the Debian kernels
>>>> IMO, if we don't enable it already.
>>>>
>>>> --
>>>>  "One disk to rule them all, One disk to find them. One disk to bring
>>>>  them all and in the darkness grind them. In the Land of Redmond
>>>>  where the shadows lie." -- The Silicon Valley Tarot
>>>>  Henrique Holschuh
>>>>
>>>
>>> Thank you for the tips, Henrique and Stan. Unfortunately I don't have time
>>> to build/test new kernels this week because I have to finish my thesis. I
>>> will have time next week to look at it and report back the results.
>>
>> In that case you could simply install the latest backport kernel image
>> and see if that does the trick.  Should be quick 'n painless.
>>
>> Add to /etc/apt/sources.list (as a single line):
>> deb http://backports.debian.org/debian-backports squeeze-backports main contrib non-free
>>
>> # aptitude update
>> # aptitude -t squeeze-backports install linux-image-3.2.0-0.bpo.1-amd64
>> # shutdown -r now
>>
>> Should take less than 5 minutes.
>>
>> --
>> Stan
>>
>>
> Funny you should mention that. Yesterday I actually tried the exact kernel
> you mentioned, and it did not go well: I got a kernel panic. I didn't run
> many tests because I didn't have much time. I went back to the old kernel,
> and though I'm not happy with the situation, the computer at least works
> and I can use the CPU to do calculations.

That Asus board just doesn't seem to want to cooperate.  At this point
I'd suggest swapping it for a Supermicro H8DGi.  IIRC you were already
prepared to send it back when I entered this thread, and you had bought
this Asus only because your preferred board wasn't available at your
preferred vendor.  My apologies for causing you to delay.  I thought we
might be able to get the Asus board working.
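
For anyone who wants to check their own boot log for the same symptom,
here is a minimal sketch in Python.  It only parses the two dmesg lines
quoted earlier in this thread (rejoined onto single lines); the function
names are made up for illustration, and the regular expressions assume
the exact message wording shown above, which may differ between kernel
versions.

```python
import re

# The two lines from the dmesg excerpt quoted in this thread, rejoined
# onto single lines. Replace with the contents of /var/log/dmesg to
# check another machine.
DMESG = """\
[    0.000000] e820 update range: 00000000e0000000 - 000000101f000000 (usable) ==> (reserved)
[    0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 61936MB of RAM.
"""

MIB = 1024 * 1024


def e820_update_ranges(log):
    """Return (start, end) byte addresses from each 'e820 update range' line."""
    pattern = re.compile(r"e820 update range: ([0-9a-fA-F]+) - ([0-9a-fA-F]+)")
    return [(int(start, 16), int(end, 16)) for start, end in pattern.findall(log)]


def mtrr_lost_mb(log):
    """Return the MB figure from the MTRR warning, or None if it is absent."""
    match = re.search(r"CPU MTRRs don't cover all of memory, losing (\d+)MB", log)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for start, end in e820_update_ranges(DMESG):
        span = (end - start) // MIB
        print(f"range {start // MIB} MiB .. {end // MIB} MiB ({span} MiB span)")
    print("kernel reports lost:", mtrr_lost_mb(DMESG), "MB")
```

The re-marked range starts at 0xe0000000, i.e. 3584 MiB (3.5 GiB),
which is roughly in line with the amount of RAM the subject line says
the system sees.  The span of the range and the "losing" figure in the
warning need not match exactly, so treat the kernel's own number as the
authoritative one.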

It's worth noting that Asus has a total of only *3* socket G34 mobos on
the market.  Only one is a standard form factor, and all 3 are dual
socket boards.  This suggests that their volume is very low, and that
Asus doesn't have much experience with the socket G34 platform.  That
would explain the development immaturity of this board, as well as the
large number of BIOS updates since it was released, each one likely the
result of a single customer report of a problem, with the exception of
the update to support Bulldozer CPUs.

For comparison, Supermicro has 22 socket G34 boards on the market,
including single, dual, and quad socket boards, supporting from 128GB to
1TB RAM (32x32GB DDR-1066 RDIMMs) in max configurations.  They offer
many packaged servers based on these boards.  This suggests they are
shipping a large volume of G34 systems and have a mature product.

This product maturity and quality is the reason I've been a fan of
Supermicro for 15 years.

-- 
Stan

