
Re: nvme SSD and poor performance



Thanks all.

I activated `# systemctl enable fstrim.timer` (thanks Linux-Fan).
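
In case it is useful: enabling alone only takes effect at the next boot, and the timer state can be checked afterwards (assuming systemd, as on a standard bullseye install):

     # systemctl start fstrim.timer
     # systemctl list-timers fstrim.timer   # shows the last and next trigger times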

But I do not think my issue is trim-related after all. There is always a lot of I/O activity from jbd2, even just after booting and even when the computer has been doing nothing for hours.

Here is an extended iotop log where you can see the abnormal jbd2 activity: https://pastebin.com/eyGcGdUz

I cannot (yet) find which process is generating this activity. I tried killing many of the jobs seen in the atop output, with no result.
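
One way to see which processes are actually behind the writes (a suggestion only; fatrace is a separate package) is to log file accesses system-wide and watch accumulated I/O per process:

     # apt install fatrace
     # fatrace -t          # prints each file access with a timestamp and the responsible process
     # iotop -aoP          # accumulated I/O per process, only showing active ones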

I don't think you have a significant performance problem, but
you are definitely feeling some pain -- so can you tell us more
about what feels slow? Does it happen during the ordinary course
of the day?

Programs are slow to start. Sometimes there is a delay when I type (letters appear a few seconds after the keystroke). Apt unpacking takes forever (5 hours to unpack packages when upgrading to bullseye).

The computer is a recent Dell Precision desktop with an i9-9900 CPU and an NVIDIA GP107GL [Quadro P400] (plus the GPU integrated in the CPU). The NVMe SSD is supposed to be a decent one. Yet this desktop is a lot slower than my (more basic) laptop.

Complete system info: https://pastebin.com/zaGVEpae




On 18/08/2021 at 00:24, David Christensen wrote:
On 8/17/21 2:54 AM, Pierre Willaime wrote:
 > Hi,
 >
 > I have an NVMe SSD (CAZ-82512-Q11 NVMe LITEON 512GB) on Debian stable
 > (bullseye now).
 >
 > For a long time, I have suffered from poor I/O performance, which slows down
 > a lot of tasks (apt upgrade when unpacking, for example).
 >
 > I am now trying to fix this issue.
 >
 > Using fstrim seems to restore speed. There are always many GiB reported
 > as trimmed:
 >
 >      #  fstrim -v /
 >      / : 236.7 GiB (254122389504 bytes) trimmed
 >
 > then, directly after:
 >
 >      #  fstrim -v /
 >      / : 0 B (0 bytes) trimmed
 >
 > but a few minutes later, there are already 1.2 GiB to trim again:
 >
 >      #  fstrim -v /
 >      / : 1.2 GiB (1235369984 bytes) trimmed
 >
 >
 > /Is it a good idea to trim, and if so, how (and how often)?/
 >
 > Some people run fstrim as a cron job, others add the "discard" option to the
 > /etc/fstab / line. I do not know which is best, if any. I have also read that
 > trimming frequently could reduce the SSD's lifespan.
 >
 >
 >
 > I also noticed many I/O accesses from jbd2 and kworker, such as:
 >
 >      # iotop -bktoqqq -d .5
 >      11:11:16     364 be/3 root        0.00 K/s    7.69 K/s  0.00 % 23.64 % [jbd2/nvme0n1p2-]
 >      11:11:16       8 be/4 root        0.00 K/s    0.00 K/s  0.00 % 25.52 % [kworker/u32:0-flush-259:0]
 >
 > The percentage given by iotop (time the thread/process spent while
 > swapping in and while waiting on I/O) is often high.
 >
 > I do not know what to do about kworker, or whether this is normal behavior.
 > For jbd2, I have read that it is the filesystem (ext4 here) journal thread.
 >
 > I added the "noatime" option to the /etc/fstab / line, but it does not seem
 > to reduce the number of accesses.
 >
 > Regards,
 > Pierre
 >
 >
 > P.S.: If trimming is needed for SSDs, why does Debian not trim by default?



On 8/17/21 6:14 AM, Pierre Willaime wrote:
On 17/08/2021 at 14:02, Dan Ritter wrote:
The first question is, how slow is this storage?


Here is a good article on using fio:
https://arstechnica.com/gadgets/2020/02/how-fast-are-your-disks-find-out-the-open-source-way-with-fio/

Thanks for the help.

Here are the outputs of the fio tests.


Single 4KiB random write process test:

WRITE: bw=197MiB/s (207MB/s), 197MiB/s-197MiB/s (207MB/s-207MB/s), io=12.0GiB (12.9GB), run=62271-62271msec

https://pastebin.com/5Cyg9Xvt


16 parallel 64KiB random write processes (two different results; further tests were closer to the second than to the first):

WRITE: bw=523MiB/s (548MB/s), 31.8MiB/s-33.0MiB/s (33.4MB/s-35.6MB/s), io=35.5GiB (38.1GB), run=63568-69533msec

WRITE: bw=201MiB/s (211MB/s), 11.9MiB/s-14.8MiB/s (12.5MB/s-15.5MB/s), io=14.3GiB (15.3GB), run=60871-72618msec


https://pastebin.com/XVpPpqsC
https://pastebin.com/HEx8VvhS


Single 1MiB random write process:

   WRITE: bw=270MiB/s (283MB/s), 270MiB/s-270MiB/s (283MB/s-283MB/s), io=16.0GiB (17.2GB), run=60722-60722msec

https://pastebin.com/skk6mi7M



Thank you for posting your fio(1) runs on pastebin -- it is far easier to comment on real data.  :-)


It would help if you told us:

1.  Make and model of computer (or motherboard and chassis, if DIY).

2.  Make and model of CPU.

3.  Quantity, make, and model of memory modules, and how your memory slots are populated.

4.  NVMe drive partitioning, formatting, space usage, etc. (the commands sketched below should cover most of this).
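
For point 4, something along these lines should capture most of it (lsblk and df are in the base system; smartctl comes from the smartmontools package; the device names are assumed from the jbd2/nvme0n1p2- thread in the iotop output):

     # lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT /dev/nvme0n1
     # df -h /
     # smartctl -a /dev/nvme0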


STFW "CAZ-82512-Q11 NVMe LITEON 512GB", that looks like a decent desktop NVMe drive.


Looking at your fio(1) runs:

1. It appears that you held 6 parameters constant:

     --name=random-write
     --ioengine=posixaio
     --rw=randwrite
     --runtime=60
     --time_based
     --end_fsync=1

2.  It appears that you varied 4 parameters:

     run    bs    size    numjobs    iodepth
     1    4k    4g    1    1
     2    64k    256m    16    16
     3    64k    256m    16    16
     4    1m    16g    1    1

3.  Runs #2 and #3 had the same parameters, but behaved and performed very differently.  Why did run #3 lay out 16 I/O files, but run #2 did not?  I suspect that the experimental procedure and/or system loading were not the same during the two runs (?).

4.  The best way to eliminate system loading variances is to create your own Debian USB live stick.  At the "Choose software to install" screen, choose "SSH server" and "standard system utilities".  Once the system is running, install the tools you want.  Use script(1) to capture console sessions.

5.  Pick one set of benchmark tool parameters and do several runs.  You want to gain enough skill running the tool so that the runs have little variance.

6.  In addition to running the benchmark tool the same way every time, you need to do all the other steps the same way every time -- rebooting, clearing kernel buffers, erasing files, trimming SSD's, enabling/disabling processes, whatever.

7.  Document your experimental procedure such that you and others can duplicate your work and compare results.

8.  For each set of runs, throw out the top third of results, throw out the bottom third of results, and use the middle third of results to make decisions.

9.  Pick one parameter to sweep, keep all other parameters fixed, and do a set of runs for each combination.  I would do --bs=64k, --size=20%, --numjobs equal to the number of cores in your CPU, and sweep --iodepth with a binary exponential progression -- e.g. 1, 2, 4, 8, 16, etc.  Once I found the sweet spot for --iodepth, I would fix --iodepth at that value, and sweep --bs from 4k to 1m.  (A sketch of such a sweep follows below.)
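
A rough sketch of that --iodepth sweep as a shell script (illustrative only: it assumes 8 cores, as on the i9-9900, should be run as root from a scratch directory on the NVMe filesystem, and ideally inside a `script fio-sweep.log` session on the live system so the output is captured, per point 4):

     #!/bin/sh
     # Sweep --iodepth while holding the other fio parameters fixed (point 9).
     # Caches are dropped before each run so every run starts from the same state (point 6).
     for depth in 1 2 4 8 16 32; do
         sync
         echo 3 > /proc/sys/vm/drop_caches
         fio --name=random-write --ioengine=posixaio --rw=randwrite \
             --bs=64k --size=20% --numjobs=8 --iodepth="$depth" \
             --runtime=60 --time_based --end_fsync=1
     done

Once the sweet spot is found, the same loop can be reused with --iodepth fixed and --bs swept from 4k to 1m, and the middle-third rule from point 8 applied to the logged bandwidth numbers.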


David


References:

[1] https://www.harddrivebenchmark.net/hdd.php?hdd=CX2-8B512-Q11%20NVMe%20LITEON%20512GB

[2] https://www.notebookcheck.net/Lite-on-CX2-8B512-Q11-256-GB-Benchmarked.458645.0.html

[3] https://linux.die.net/man/1/fio

[4] https://linux.die.net/man/8/fstrim

[5] https://linux.die.net/man/1/script

[6] https://linux.die.net/man/1/df


--
Pierre Willaime - CNRS
Archives Henri Poincaré - Nancy

