
Re: Laptop randomly reboots



On 01/20/17 18:04, Sam Smith wrote:
I'll try to keep this short. I bought a used Lenovo T520 back in May. It
had the motherboard with nvidia GPU. Because it sucks power and doesn't
really have good suspend/resume support, I bought a used motherboard
off of ebay that only had intel integrated graphics and I swapped it
out. All was well and I had installed in it 8gb + 4gb of ram.

It ran like that for about 8 weeks before I bought an 8gb stick and
stuck that in, so then I had 8gb + 8gb of ram. After about 6 weeks of
running like that, it randomly rebooted overnight. I shrugged it off and
thought maybe the power went out or something (even though it had a
battery in it). But then about 2 weeks later it did it again... and then
two weeks after that. So I pulled out the new 8gb stick I had put in and
let it run with just one 8gb stick. It ran like that for about 10 weeks
without a problem. I put the old 4gb stick in just for fun, bringing it
back to the original 8gb + 4gb configuration. But about 2 weeks later it
rebooted again. At that point I bought a matched 16gb kit (8gb + 8gb)
from Newegg that seemed to come recommended from google searching for
compatible ram for this model. But just a couple of days ago (about 3
weeks after installing it), it rebooted by itself.

I am kind of at a loss here now. I can buy another motherboard and swap
it out again, but that takes a few hours and I don't feel like doing it.
The cooling and thermal stuff is all good on the laptop; I've run prime95
and video encoding for hours and it is fine (temps stay below 80* at
least, normal usage is 40-55*). I've also run memtest for a few hours.

What I find weird is that the machine suddenly reboots. At least a few
years ago, ram issues would just lead to a kernel panic screen. But with
this, the machine is just like someone pulled the plug and rebooted it.
I started to wonder if there is some built in watchdog somewhere that
will reboot the machine if it hangs, but I can't tell? Other than that,
if this is the kernel that is rebooting the machine, is there any way I
can get it to dump some info somewhere before it fully reboots? Before I
go through the pain of swapping the board again, I'd just like to really
know that this is a hardware issue and not the kernel detecting
something and just choosing to reboot...

Memory errors are more common than we'd like to believe:


http://www.zdnet.com/article/dram-error-rates-nightmare-on-dimm-street/?_escaped_fragment_=#!

    A two-and-a-half year study of DRAM on 10s of thousands Google
    servers found DIMM error rates are hundreds to thousands of times
    higher than thought -- a mean of 3,751 correctable errors per DIMM
    per year.

    Non-ECC DRAM is more common: Most DIMMs don’t include ECC because it
    costs more. Without ECC the system doesn’t know a memory error has
    occurred.

    Bad news: Besides error rates much higher than expected - which is
    plenty bad - the study found that error rates were motherboard, not
    DIMM type or vendor, dependent. This means that some popular mobos
    have poor EMI hygiene. Route a memory trace too close to noisy
    component or shirk on grounding layers and instant error problems.

    Other interesting findings: For all platforms they found that 20% of
    the machines with errors make up more than 90% of all observed
    errors on that platform. There be lemons out there!


Without ECC memory, there's no way to know if you really have a memory problem.
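
To put the quoted figure in perspective, here is a rough back-of-the-envelope sketch. The arithmetic is mine, not the article's, and it assumes the server mean applies to a laptop with both sockets populated, which is a big assumption: the same study says 20% of the machines with errors account for more than 90% of the errors, so a typical machine probably sees far fewer.

```python
# Rough arithmetic on the mean quoted from the article.
# Assumption: the server figure carries over to a two-DIMM laptop.

MEAN_ERRORS_PER_DIMM_PER_YEAR = 3751  # correctable errors, from the quote
DAYS_PER_YEAR = 365


def errors_per_day(per_dimm_per_year: float, dimms: int) -> float:
    """Convert a per-DIMM yearly error count into a per-machine daily count."""
    return per_dimm_per_year * dimms / DAYS_PER_YEAR


if __name__ == "__main__":
    for dimms in (1, 2):
        rate = errors_per_day(MEAN_ERRORS_PER_DIMM_PER_YEAR, dimms)
        print(f"{dimms} DIMM(s): about {rate:.1f} correctable errors per day")
```

On a non-ECC machine none of these would be detected or corrected, so an average like this only hints at how plausible silent memory errors are.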


Looking at the data sheet for your computer:


http://www.lenovo.com/shop/americas/content/pdf/system_data/t520_tech_specs.pdf


It covers three variants:

1.  ThinkPad® T520 4243 (Onsite)

2.  ThinkPad® T520 4243 (Optimus) - Onsite

3.  ThinkPad® T520i 4239 (TopSeller)


All three say:

    Memory

        8GB max [7] / PC3-10600 1333MHz DDR3, non-parity,
        dual-channel capable, two 204-pin SO-DIMM sockets

    See footnotes for more detailed information


I can't find the footnotes.


It appears that you have exceeded the manufacturer's specifications.


Why do you believe installing two 8 GB memory modules will work in this computer?


What operating system are you running?


Have you installed any software other than official Debian binary packages?


Some ideas:

1. Put identifying marks or sequential numbers on items that otherwise look the same -- memory modules, SATA cables, adapter cards, etc.

2. Keep detailed notes in a plain text file using a method that allows access from multiple computers. (I use CVS over SSH, with the repository on my file server.)

3. If everything is per the specifications, and module X in slot A and module Y in slot B results in problems, swapping the modules sometimes solves the problem. This has worked for me more than once.

4. I once built a computer with what appeared to be an infrequent memory problem. memtest86 ran for over a day before finding one error.

5. I once upgraded a computer from 2 @ 256 MB memory modules to 2 @ 1 GB, and encountered memory problems. I had another machine with the same motherboard, 2 @ 512 MB memory modules, and no problems. I swapped memory between the two machines and both computers worked fine.
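
Ideas 1 through 3 can be combined into a small logging habit. The following is only a minimal sketch, not the method I actually use: the function name `log_note`, the file name, and the module numbering are all hypothetical. It appends timestamped lines to a plain text file, which can then be kept under whatever version control you prefer.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def log_note(path: Path, text: str, when: Optional[datetime] = None) -> str:
    """Append one timestamped line to a plain-text notes file and return it."""
    when = when or datetime.now()
    line = f"{when:%Y-%m-%d %H:%M}  {text}"
    with path.open("a", encoding="utf-8") as f:
        f.write(line + "\n")
    return line


if __name__ == "__main__":
    notes = Path("t520-notes.txt")  # hypothetical file name
    # Record which marked module sits in which slot, then each event.
    log_note(notes, "slot A: module #1 (8 GB); slot B: module #2 (8 GB)")
    log_note(notes, "unexpected reboot overnight")
    log_note(notes, "swapped modules: slot A: #2, slot B: #1")
```

With a record like this, the interval between a configuration change and the next reboot can be read straight from the file instead of reconstructed from memory.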


David

