
Re: Memory errors on new memory in new system

On 29/11/12 02:38 AM, Marc Shapiro wrote:
On 11/28/2012 09:24 AM, Gary Dale wrote:
I'm afraid I can't really answer that. All I can do is repeat that when MemTest86+ reports an error, it is a good indication that you have a problem. I've seen this on systems where MemTest86+ reported only a few problems but the computer locked up intermittently in use. Replacing the memory with modules that passed MemTest86+ cured the lockups.

I also have a motherboard that ran reliably for 2 years and then started locking up. MemTest86+ reported that the memory was OK, but Klaus Knopper's suggestion that it was a chipset problem seemed reasonable, since the problem occurred during operations where both memory and disk access were high. The manufacturer meanwhile replaced the board three times with repaired boards, all of which displayed the same problem. Their repair process tests components in isolation, which is usually OK but in this case failed to trigger the real problem.

The problem seems to be more common on modern hardware than on older systems. I hadn't seen it before but have now seen it on a couple of different motherboards. Slowing down the memory access cures it.

Well, it's not just Memtest86+ that is reporting errors. I decided to get a second opinion, as it were, and installed memtester from the Debian repository. I let it run through 5 loops of its tests. I have included the results of the first loop below. The other loops were similar. None of the five loops found any errors in the first 9 tests. The 'Checkerboard' test found errors in 2 of the 5 loops, and the 'Walking Zeroes' test found errors only in the first loop. As you can see, I was testing 7GB on an 8GB system. This was running from inside an xterm while I was cruising the web. As you can see from the results of running 'free' while the test was going, I had under 80MB (less than 1% of my total memory) free, so the full memory was getting a workout. I am just having difficulty with the idea that there are this many memory errors throughout the range of the RAM and yet I keep right on working with no apparent difficulties. Others that I have talked to, who have actually had memory go bad on them (I never have, before), say that it is extremely obvious and normal operations are not possible. I believe the term one of them used was that "the OS completely wigged out!"

Do I really have bad memory? Or is this some other kind of aberration? Sunday is the end of my in-store return period. After that I have to ship things back to the manufacturers, which is a problem if I don't really know where the trouble lies.

Of course, bad memory doesn't always make the computer lock up completely. MemTest86+ and memtester are clearly able to trigger errors without the system screeching to a halt. Conversely, an entire stick can fail, leaving your system running with less memory than you installed. In that case, you may not notice any problems except for slower performance.

When it's only certain patterns that fail, a system failure would depend on that particular byte holding that particular pattern, and on the resulting bit error landing somewhere that causes a program to fail noticeably. A bit flipped in a data area, for example, may lead to a subtle data error, such as a glitch in a video, or it may never show up at all, such as a bit in an unused portion of a buffer.
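To make the point about pattern-dependent failures concrete, here is a toy simulation in Python. It is only a sketch and has nothing to do with how memtester or MemTest86+ are actually implemented: the FlakyMemory class, the run_pattern helper, and the fault rule itself (one bit reads back as 0 only when it was stored as 1 and both neighbouring bits are 0) are invented for illustration.

```python
class FlakyMemory:
    """Simulated RAM with one hypothetical pattern-sensitive bit."""

    def __init__(self, size, bad_addr, bad_bit):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.bad_bit = bad_bit

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        value = self.cells[addr]
        if addr == self.bad_addr:
            bit = 1 << self.bad_bit
            neighbours = (bit << 1 | bit >> 1) & 0xFF
            # Assumed fault model: the bit is lost only when it is set
            # and both adjacent bits in the same byte are clear.
            if value & bit and not value & neighbours:
                value &= ~bit & 0xFF
        return value


def run_pattern(mem, pattern):
    """Fill memory with one byte pattern; return addresses that read back wrong."""
    size = len(mem.cells)
    for addr in range(size):
        mem.write(addr, pattern)
    return [addr for addr in range(size) if mem.read(addr) != pattern]


if __name__ == "__main__":
    mem = FlakyMemory(size=64, bad_addr=17, bad_bit=2)
    for name, pattern in [("all zeros", 0x00), ("all ones", 0xFF),
                          ("checkerboard 0x55", 0x55),
                          ("checkerboard 0xAA", 0xAA)]:
        print(f"{name:18} failing addresses: {run_pattern(mem, pattern)}")
```

With these made-up parameters, only the 0x55 checkerboard reports address 17; the solid patterns and the 0xAA checkerboard pass. A real fault need not follow this rule, but it shows how a cell can pass most tests, fail one, and still leave everyday use looking normal.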

My advice is that if MemTest86+ shows you have a memory problem, believe it. And even if it doesn't show errors, that doesn't mean there aren't memory access issues.
