
Re: nvme SSD and poor performance



On 8/17/21 2:54 AM, Pierre Willaime wrote:
> Hi,
>
> I have an NVMe SSD (CAZ-82512-Q11 NVMe LITEON 512GB) on Debian stable
> (bullseye now).
>
> For a long time, I have suffered from poor I/O performance, which slows
> down a lot of tasks (apt upgrade when unpacking, for example).
>
> I am now trying to fix this issue.
>
> Using fstrim seems to restore speed. There are always many GiB which get
> trimmed:
>
>      #  fstrim -v /
>      /: 236.7 GiB (254122389504 bytes) trimmed
>
> then, directly after:
>
>      #  fstrim -v /
>      /: 0 B (0 bytes) trimmed
>
> but a few minutes later, there is already 1.2 GiB to trim again:
>
>      #  fstrim -v /
>      /: 1.2 GiB (1235369984 bytes) trimmed
>
>
> /Is it a good idea to trim, and if yes, how (and how often)?/
>
> Some people run fstrim as a cron job, others add the "discard" option to
> the /etc/fstab line for /. I do not know which is best, if any. I have
> also read that trimming frequently could reduce the SSD's life.
>
>
>
> I also noticed many I/O accesses from jbd2 and kworker, such as:
>
>      # iotop -bktoqqq -d .5
>      11:11:16     364 be/3 root        0.00 K/s    7.69 K/s  0.00 % 23.64 % [jbd2/nvme0n1p2-]
>      11:11:16       8 be/4 root        0.00 K/s    0.00 K/s  0.00 % 25.52 % [kworker/u32:0-flush-259:0]
>
> The percentage given by iotop (time the thread/process spent while
> swapping in and while waiting on I/O) is often high.
>
> I do not know what to do about kworker, or whether this is normal
> behavior. As for jbd2, I have read it is the filesystem (ext4 here)
> journal thread.
>
> I added the "noatime" option to the /etc/fstab line for / but it does not
> seem to reduce the number of accesses.
>
> Regards,
> Pierre
>
>
> P.S.: If trimming is needed for SSDs, why does Debian not trim by default?
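
Regarding the trim question: on bullseye, the util-linux package ships a
systemd timer, fstrim.timer, that runs fstrim weekly on the mounted
filesystems that support discard. That is generally considered preferable
to the "discard" mount option, which trims continuously and can add
latency. A minimal sketch of checking whether the timer is enabled and
turning it on, assuming a stock systemd install:

     # systemctl status fstrim.timer
     # systemctl enable --now fstrim.timer
     # systemctl list-timers fstrim.timer

The GiB that reappear a few minutes after a manual trim are just blocks
freed by normal file deletion since the last fstrim run; a weekly trim is
usually often enough for a desktop workload.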



On 8/17/21 6:14 AM, Pierre Willaime wrote:
> On 17/08/2021 at 14:02, Dan Ritter wrote:
>> The first question is, how slow is this storage?
>>
>>
>> Here is a good article on using fio:
>> https://arstechnica.com/gadgets/2020/02/how-fast-are-your-disks-find-out-the-open-source-way-with-fio/
>
> Thanks for the help.
>
> Here are the outputs of the fio tests.
>
>
> Single 4KiB random write process test:
>
>      WRITE: bw=197MiB/s (207MB/s), 197MiB/s-197MiB/s (207MB/s-207MB/s), io=12.0GiB (12.9GB), run=62271-62271msec
>
> https://pastebin.com/5Cyg9Xvt
>
>
> 16 parallel 64KiB random write processes (two quite different results; further tests are closer to the second than the first):
>
>      WRITE: bw=523MiB/s (548MB/s), 31.8MiB/s-33.0MiB/s (33.4MB/s-35.6MB/s), io=35.5GiB (38.1GB), run=63568-69533msec
>
>      WRITE: bw=201MiB/s (211MB/s), 11.9MiB/s-14.8MiB/s (12.5MB/s-15.5MB/s), io=14.3GiB (15.3GB), run=60871-72618msec
>
>
> https://pastebin.com/XVpPpqsC
> https://pastebin.com/HEx8VvhS
>
>
> Single 1MiB random write process:
>
>      WRITE: bw=270MiB/s (283MB/s), 270MiB/s-270MiB/s (283MB/s-283MB/s), io=16.0GiB (17.2GB), run=60722-60722msec
>
> https://pastebin.com/skk6mi7M



Thank you for posting your fio(1) runs on pastebin -- it is far easier to comment on real data. :-)


It would help if you told us:

1.  Make and model of computer (or motherboard and chassis, if DIY).

2.  Make and model of CPU.

3. Quantity, make, and model of memory modules, and how your memory slots are populated.

4.  NVMe drive partitioning, formatting, space usage, etc. (a few commands that would gather most of the above are sketched below).
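
If it helps, here is a rough sketch of commands that would collect most of
that information. dmidecode and nvme-cli are assumptions (install those
packages if they are not already present), and everything should be run as
root:

     # lscpu                       (CPU make, model, and core count)
     # free -h                     (total memory)
     # dmidecode -t baseboard      (motherboard make and model)
     # dmidecode -t memory         (DIMM slot population)
     # lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT /dev/nvme0n1
     # df -h /                     (space usage; see df(1) [6])
     # nvme list                   (drive model and firmware revision)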


STFW "CAZ-82512-Q11 NVMe LITEON 512GB", that looks like a decent desktop NVMe drive.


Looking at your fio(1) runs:

1. It appears that you held 6 parameters constant:

	--name=random-write
	--ioengine=posixaio
	--rw=randwrite
	--runtime=60
	--time_based
	--end_fsync=1

2.  It appears that you varied 4 parameters:

	run	bs	size	numjobs	iodepth
	1	4k	4g	1	1
	2	64k	256m	16	16
	3	64k	256m	16	16
	4	1m	16g	1	1

3. Runs #2 and #3 had the same parameters, but behaved differently and produced very different results. Why did run #3 lay out 16 I/O files, but run #2 did not? I suspect that the experimental procedure and/or system loading were not the same during the two runs (?).

4. The best way to eliminate system loading variances is to create your own Debian USB live stick. At the "Choose software to install" screen, choose "SSH server" and "standard system utilities". Once the system is running, install the tools you want. Use script(1) to capture console sessions.

5. Pick one set of benchmark tool parameters and do several runs. You want to gain enough skill running the tool so that the runs have little variance.

6. In addition to running the benchmark tool the same way every time, you need to do all the other steps the same way every time -- rebooting, clearing kernel buffers, erasing files, trimming SSDs, enabling/disabling processes, whatever (see the sketch after this list).

7. Document your experimental procedure such that you and others can duplicate your work and compare results.

8. For each set of runs, throw out the top third of results, throw out the bottom third of results, and use the middle third of results to make decisions.

9. Pick one parameter to sweep, keep all other parameters fixed, and do a set of runs for each combination. I would do --bs=64k, --size=20%, --numjobs equal to the number of cores in your CPU, and sweep --iodepth with a binary exponential progression -- e.g. 1, 2, 4, 8, 16, etc. Once I found the sweet spot for --iodepth, I would fix --iodepth at that value and sweep --bs from 4k to 1m. (A rough sketch of such a sweep is below.)
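
As an illustration of 6 and 9, here is a rough sketch of such a sweep. The
fixed fio parameters mirror the ones above; the --numjobs value and the
output file names are placeholders, so adjust them to your machine:

	#!/bin/sh
	# Sweep --iodepth with a binary exponential progression,
	# repeating the same reset steps before every run.
	for depth in 1 2 4 8 16 32; do
	    sync
	    echo 3 > /proc/sys/vm/drop_caches      # clear kernel buffers
	    fstrim -v /                            # trim before each run
	    fio --name=random-write --ioengine=posixaio --rw=randwrite \
	        --bs=64k --size=20% --numjobs=8 --iodepth="$depth" \
	        --runtime=60 --time_based --end_fsync=1 \
	        --output="iodepth-$depth.txt"
	    rm -f random-write.*                   # erase the test files between runs
	done

Running the whole sweep under script(1) -- for example, starting it with
"script -a sweep.log" -- captures the console output so it can be posted
afterwards [5].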


David


References:

[1] https://www.harddrivebenchmark.net/hdd.php?hdd=CX2-8B512-Q11%20NVMe%20LITEON%20512GB

[2] https://www.notebookcheck.net/Lite-on-CX2-8B512-Q11-256-GB-Benchmarked.458645.0.html

[3] https://linux.die.net/man/1/fio

[4] https://linux.die.net/man/8/fstrim

[5] https://linux.die.net/man/1/script

[6] https://linux.die.net/man/1/df

