
Re: nvme SSD and poor performance



Thanks all.

I activated `# systemctl enable fstrim.timer` (thanks Linux-Fan).
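
In case it is useful: enabling alone only takes effect at the next boot, and the timer state can be checked afterwards (assuming systemd, as on a standard bullseye install):

     # systemctl start fstrim.timer
     # systemctl list-timers fstrim.timer   # shows the last and next trigger times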

But I do not think my issue is trim-related after all. There is always a lot of I/O activity from jbd2, even just after booting and even when the computer has been doing nothing for hours.

Here is an extended iotop log where you can see the abnormal jbd2 activity: https://pastebin.com/eyGcGdUz

I cannot (yet) find which process is generating this activity. I tried killing many of the jobs seen in the atop output, with no result.
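
One way to see which processes are actually behind the writes (a suggestion only; fatrace is a separate package) is to log file accesses system-wide and watch accumulated I/O per process:

     # apt install fatrace
     # fatrace -t          # prints each file access with a timestamp and the responsible process
     # iotop -aoP          # accumulated I/O per process, only showing active ones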

I don't think you have a significant performance problem, but
you are definitely feeling some pain -- so can you tell us more
about what feels slow? Does it happen during the ordinary course
of the day?

Programs are slow to start. Sometimes there is a delay when I type (letters appear a few seconds after the keystroke). Apt unpacking takes forever (5 hours to unpack packages when upgrading to bullseye).

The computer is a recent Dell Precision desktop with an i9-9900 CPU and an NVIDIA GP107GL [Quadro P400] (plus the GPU integrated in the CPU). The NVMe SSD is supposed to be a decent one. Yet this desktop is a lot slower than my (more basic) laptop.

Complete system info: https://pastebin.com/zaGVEpae




On 18/08/2021 at 00:24, David Christensen wrote:
On 8/17/21 2:54 AM, Pierre Willaime wrote:
 > Hi,
 >
 > I have an NVMe SSD (CAZ-82512-Q11 NVMe LITEON 512GB) on Debian stable
 > (bullseye now).
 >
 > For a long time, I have suffered from poor I/O performance, which slows down
 > a lot of tasks (apt upgrade when unpacking, for example).
 >
 > I am now trying to fix this issue.
 >
 > Using fstrim seems to restore speed. There are always many GiB reported
 > as trimmed:
 >
 >      #  fstrim -v /
 >      / : 236.7 GiB (254122389504 bytes) trimmed
 >
 > then, directly after:
 >
 >      #  fstrim -v /
 >      / : 0 B (0 bytes) trimmed
 >
 > but a few minutes later, there are already 1.2 GiB to trim again:
 >
 >      #  fstrim -v /
 >      / : 1.2 GiB (1235369984 bytes) trimmed
 >
 >
 > /Is it a good idea to trim, and if so, how (and how often)?/
 >
 > Some people run fstrim as a cron job, others add the "discard" option to the
 > /etc/fstab / line. I do not know which is best, if any. I have also read that
 > trimming frequently could reduce the SSD's lifespan.
 >
 >
 >
 > I also noticed many I/O accesses from jbd2 and kworker, such as:
 >
 >      # iotop -bktoqqq -d .5
 >      11:11:16     364 be/3 root        0.00 K/s    7.69 K/s  0.00 % 23.64 % [jbd2/nvme0n1p2-]
 >      11:11:16       8 be/4 root        0.00 K/s    0.00 K/s  0.00 % 25.52 % [kworker/u32:0-flush-259:0]
 >
 > The percentage given by iotop (time the thread/process spent while
 > swapping in and while waiting on I/O) is often high.
 >
 > I do not know what to do about kworker, or whether this is normal behavior.
 > For jbd2, I have read that it is the filesystem (ext4 here) journal thread.
 >
 > I added the "noatime" option to the /etc/fstab / line, but it does not seem
 > to reduce the number of accesses.
 >
 > Regards,
 > Pierre
 >
 >
 > P.S.: If trimming is needed for SSDs, why does Debian not trim by default?



On 8/17/21 6:14 AM, Pierre Willaime wrote:
On 17/08/2021 at 14:02, Dan Ritter wrote:
The first question is, how slow is this storage?


Here is a good article on using fio:
https://arstechnica.com/gadgets/2020/02/how-fast-are-your-disks-find-out-the-open-source-way-with-fio/

Thanks for the help.

Here are the outputs of the fio tests.


Single 4KiB random write process test:

WRITE: bw=197MiB/s (207MB/s), 197MiB/s-197MiB/s (207MB/s-207MB/s), io=12.0GiB (12.9GB), run=62271-62271msec

https://pastebin.com/5Cyg9Xvt


16 parallel 64KiB random write processes (two different results; further tests were closer to the second than to the first):

WRITE: bw=523MiB/s (548MB/s), 31.8MiB/s-33.0MiB/s (33.4MB/s-35.6MB/s), io=35.5GiB (38.1GB), run=63568-69533msec

WRITE: bw=201MiB/s (211MB/s), 11.9MiB/s-14.8MiB/s (12.5MB/s-15.5MB/s), io=14.3GiB (15.3GB), run=60871-72618msec


https://pastebin.com/XVpPpqsC
https://pastebin.com/HEx8VvhS


Single 1MiB random write process:

   WRITE: bw=270MiB/s (283MB/s), 270MiB/s-270MiB/s (283MB/s-283MB/s), io=16.0GiB (17.2GB), run=60722-60722msec

https://pastebin.com/skk6mi7M



Thank you for posting your fio(1) runs on pastebin -- it is far easier to comment on real data.  :-)


It would help if you told us:

1.  Make and model of computer (or motherboard and chassis, if DIY).

2.  Make and model of CPU.

3.  Quantity, make, and model of memory modules, and how your memory slots are populated.

4.  NVMe drive partitioning, formatting, space usage, etc. (the commands sketched below should cover most of this).
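
For point 4, something along these lines should capture most of it (lsblk and df are in the base system; smartctl comes from the smartmontools package; the device names are assumed from the jbd2/nvme0n1p2- thread in the iotop output):

     # lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT /dev/nvme0n1
     # df -h /
     # smartctl -a /dev/nvme0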


STFW "CAZ-82512-Q11 NVMe LITEON 512GB", that looks like a decent desktop NVMe drive.


Looking at your fio(1) runs:

1. It appears that you held 6 parameters constant:

     --name=random-write
     --ioengine=posixaio
     --rw=randwrite
     --runtime=60
     --time_based
     --end_fsync=1

2.  It appears that you varied 4 parameters:

     run    bs    size    numjobs    iodepth
     1    4k    4g    1    1
     2    64k    256m    16    16
     3    64k    256m    16    16
     4    1m    16g    1    1

3.  Runs #2 and #3 had the same parameters, but behaved and performed very differently.  Why did run #3 lay out 16 I/O files, but run #2 did not?  I suspect that the experimental procedure and/or system loading were not the same during the two runs (?).

4.  The best way to eliminate system loading variances is to create your own Debian USB live stick.  At the "Choose software to install" screen, choose "SSH server" and "standard system utilities".  Once the system is running, install the tools you want.  Use script(1) to capture console sessions.

5.  Pick one set of benchmark tool parameters and do several runs.  You want to gain enough skill running the tool so that the runs have little variance.

6.  In addition to running the benchmark tool the same way every time, you need to do all the other steps the same way every time -- rebooting, clearing kernel buffers, erasing files, trimming SSD's, enabling/disabling processes, whatever.

7.  Document your experimental procedure such that you and others can duplicate your work and compare results.

8.  For each set of runs, throw out the top third of results, throw out the bottom third of results, and use the middle third of results to make decisions.

9.  Pick one parameter to sweep, keep all other parameters fixed, and do a set of runs for each combination.  I would do --bs=64k, --size=20%, --numjobs equal to the number of cores in your CPU, and sweep --iodepth with a binary exponential progression -- e.g. 1, 2, 4, 8, 16, etc.  Once I found the sweet spot for --iodepth, I would fix --iodepth at that value, and sweep --bs from 4k to 1m.  (A sketch of such a sweep follows below.)
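
A rough sketch of that --iodepth sweep as a shell script (illustrative only: it assumes 8 cores, as on the i9-9900, should be run as root from a scratch directory on the NVMe filesystem, and ideally inside a `script fio-sweep.log` session on the live system so the output is captured, per point 4):

     #!/bin/sh
     # Sweep --iodepth while holding the other fio parameters fixed (point 9).
     # Caches are dropped before each run so every run starts from the same state (point 6).
     for depth in 1 2 4 8 16 32; do
         sync
         echo 3 > /proc/sys/vm/drop_caches
         fio --name=random-write --ioengine=posixaio --rw=randwrite \
             --bs=64k --size=20% --numjobs=8 --iodepth="$depth" \
             --runtime=60 --time_based --end_fsync=1
     done

Once the sweet spot is found, the same loop can be reused with --iodepth fixed and --bs swept from 4k to 1m, and the middle-third rule from point 8 applied to the logged bandwidth numbers.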


David


References:

[1] https://www.harddrivebenchmark.net/hdd.php?hdd=CX2-8B512-Q11%20NVMe%20LITEON%20512GB

[2] https://www.notebookcheck.net/Lite-on-CX2-8B512-Q11-256-GB-Benchmarked.458645.0.html

[3] https://linux.die.net/man/1/fio

[4] https://linux.die.net/man/8/fstrim

[5] https://linux.die.net/man/1/script

[6] https://linux.die.net/man/1/df


--
Pierre Willaime - CNRS
Archives Henri Poincaré - Nancy

