
Re: SSD Optimization - Crucial CT1000MX500SSD1



On 10/3/22 09:23, piorunz wrote:
On 02/10/2022 21:33, David Christensen wrote:
On 10/2/22 06:19, Marcelo Laia wrote:
# cat /etc/debian_version ; uname -a

bookworm/sid
Linux marcelo 5.19.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.19.11-1 (2022-09-24) x86_64 GNU/Linux


Please install Debian Stable.

Why would he?
I have exactly the same SSD (two of them) in my machine, on Debian
Testing, drives in BTRFS RAID1 mode, and everything works perfectly. But I
have good SATA cables.
The OS version has nothing to do with cabling errors in the SSD's SMART log.
He may as well be using DOS, Windows, FreeBSD, or any Linux - cabling errors
must never happen.

$ uname -a
Linux ryzen 5.19.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.19.11-1 (2022-09-24) x86_64 GNU/Linux

$ sudo smartctl /dev/sda --all | grep "Device Model\|SATA_Interfac\|DMA_CRC_Error"
Device Model:     CT1000MX500SSD1
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0

$ sudo smartctl /dev/sdb --all | grep "Device Model\|SATA_Interfac\|DMA_CRC_Error"
Device Model:     CT1000MX500SSD1
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0


Even if you and the OP ran identical OS instances (e.g. clones), I do not believe you two have the same make and model computers. Therefore, different code paths will be executed -- e.g. device drivers. So, the OP's computer may be hitting a bug that your computer does not.


I am applying a trouble-shooting strategy -- change one variable, apply a stimulus, and measure the result. If the result is the same as it was before, then the result is unlikely to be related to the variable and/or change. But if the result is different, then the result is likely to be related to the variable and/or change.


Of course, this is all premised upon devising a stimulus that reliably reproduces the result. When my HDDs/SSDs were having SATA cable and/or drive rack problems, reading 10 GB from them typically produced at least one error.
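
A minimal sketch of such a stimulus-and-measure run, for anyone who wants to repeat it. The function names crc_count and read_test are made up for illustration, /dev/sdX is a placeholder for the device under test, and the sketch assumes smartmontools and GNU dd are installed and that it is run as root. It only reads from the device:

```shell
#!/bin/sh
# Sketch only. crc_count and read_test are hypothetical helper names.

# crc_count: read `smartctl -A` output on stdin and print the raw value
# (10th column) of attribute 199, UDMA_CRC_Error_Count.
crc_count() {
    awk '$1 == 199 { print $10 }'
}

# read_test DEVICE: read the first 10 GiB of DEVICE, bypassing the page
# cache, and report the CRC error count before and after the read.
read_test() {
    dev=$1
    before=$(smartctl -A "$dev" | crc_count)
    dd if="$dev" of=/dev/null bs=1M count=10240 iflag=direct
    after=$(smartctl -A "$dev" | crc_count)
    echo "UDMA_CRC_Error_Count: before=$before after=$after"
}
```

After sourcing the file, "read_test /dev/sdX" runs one trial. If the "after" count is higher than the "before" count, the stimulus reproduced the error.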


When the OP read 10 GB of the SSD using the d-i rescue shell, he was applying a stimulus after changing the variable "OS instance". The result was different. Therefore, the SATA UDMA CRC errors are likely related to changing the OS instance.


But the above experiment has significant flaws (here are a few; I expect there are more):

1. We cannot reproduce the OP's hardware and software.

2. We do not know what Debian installer the OP used (but we could obtain it if he told us).

3. The stimulus read from the SSD. The UDMA CRC errors may only occur during writes.

4. The SMART reports indicate 38 UDMA CRC errors for 1296000877 Logical Sectors Written and 801097450 Logical Sectors Read. So, an average of 1 error per 5.52E+7 sectors. The test read 2.05E+7 sectors. That might be too few sectors.

5. Similarly, for Number of Read Commands -- 1 error per 4.43E+5 commands vs. 1.02E+4 test commands.

6. The Debian installer rescue shell is single-user (single-process?), but the UDMA errors were seen during multi-user operation (SMP). If the SATA UDMA errors are caused by concurrency/parallel execution, the d-i rescue shell environment may not be capable of reproducing the error.
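
The arithmetic behind the sector and command counts in the list above can be checked with a short sketch. The function names sectors_per_error and expected_errors are made up for illustration, the figures are the ones quoted from the SMART report, and the "expected errors" number assumes the errors are spread evenly over all transfers, which may not hold:

```shell
#!/bin/sh
# Sketch only: all figures are taken from the SMART report quoted above.

# sectors_per_error ERRORS SECTORS_WRITTEN SECTORS_READ
# Prints the average number of sectors transferred per UDMA CRC error.
sectors_per_error() {
    awk -v e="$1" -v w="$2" -v r="$3" 'BEGIN { printf "%.2e\n", (w + r) / e }'
}

# expected_errors TEST_UNITS UNITS_PER_ERROR
# Prints how many errors a test of TEST_UNITS would see on average,
# assuming errors are evenly distributed (an assumption, not a fact).
expected_errors() {
    awk -v t="$1" -v s="$2" 'BEGIN { printf "%.2f\n", t / s }'
}

# Sectors: 38 errors over all sectors written plus read.
sectors_per_error 38 1296000877 801097450
# Expected errors for the 2.05E+7-sector test read.
expected_errors 2.05e7 "$(sectors_per_error 38 1296000877 801097450)"
# Expected errors for the 1.02E+4 test read commands.
expected_errors 1.02e4 4.43e5
```

Under that assumption the 10 GB read would see well under one error on average, by either measure, so a clean run is weak evidence on its own.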


If the OP installs Debian Stable on the SSD, runs the 10 GB sequential read test, uses the system interactively, and the SATA UDMA errors are not seen for some period of time (a week?), then I would be reasonably confident the problem was the Debian Testing instance on the SSD. But if the errors persist, then we will have to think up another hypothesis and experiment.


David

