
Bug#613321: linux-image-amd64: Please enable 'memtest' option for all linux kernels



On Friday 02 December 2011 07:59:02 Jonathan Nieder wrote:
> > Having said that I don't know if it's sensible to add to Debian as I
> > didn't test runtime and binary size overhead.

Binary size overhead is really negligible.

> No opinion on that from me.  It does seem a shame that many kinds of
> faults would be likely to be missed:

Let's not fall into a discussion about the quality of this feature. It doesn't
matter. It doesn't have to be perfect to be included. Personally I think it is
good enough for inclusion as is, because it does catch some errors.

ECC is not perfect either, and MEMTEST appears to be better than, or at least
as useful as, ECC.

 
> That seems like the bigger potential cost.  When someone runs into
> corruption that the memtest option did not catch, what can we say to
> such a person?  (It would be easier if there were a manpage for kernel
> parameters and a culture such that everyone read it before enabling
> them.)

Such a person, if capable of activating MEMTEST with a boot-time argument to
the kernel, may have already read something about it.
We can't take responsibility for that person's decisions (or expectations).

The use case for MEMTEST is not to catch all errors but to minimize damage.
ECC doesn't catch all errors either, but it is better to have it to avoid
massive data corruption due to bad RAM.

The vast majority of computers out there do not have any form of ECC, and we're
not allowing users to have any protection against RAM errors because someone
has an unexplained reluctance regarding MEMTEST inclusion.

Is my test case not good enough?

I've seen faulty RAM on servers running 24/7 for years, where the fault was
discovered only after a hardware upgrade. Data corruption had probably been
happening for a very long time, and the impact is very difficult to assess.

MEMTEST can be helpful in this scenario, and it can be helpful for notebooks
and desktop PCs where people work with sensitive data and do not routinely
check their memory on a weekly/monthly basis. (Who does?)

 
> I should have just been quiet. :)

No worries, your thoughts are welcome. :)



