Re: [idea]: Switch default compression from "xz" to "zstd" for .deb packages
M. Zhou wrote:
> Just one comment.
>
> Be careful if it bloats up our mirrors. Is there any estimate on
> the extra space cost for a full debian mirror?
>
> If we trade-off the disk space with decompression speed, zstd -19
> is not necessarily very fast. I did not benchmark, but it is slow.
Anecdotally, for "linux-image-6.5.0-1-amd64_6.5.3-1_amd64.deb" the
data.tar component takes about 72 MiB when compressed with xz -6 and
about 79 MiB when compressed with zstd -19, so about 10% larger with
zstd.

This is specifically with multi-threaded compression. The behavior of
-T0 differs slightly between xz and zstd: it looks like xz -T0 uses the
number of threads supported by the CPU, while zstd -T0 uses the number
of physical cores in the CPU. That is why the zstd run below passes
--auto-threads=logical. The most direct multi-threaded compression
comparison between the two compressors was:
$ time xz -v -k -T0 -6 data.tar
data.tar (1/1)
100 % 71.9 MiB / 452.5 MiB = 0.159 21 MiB/s 0:21
Performance counter stats for 'xz -v -k -T0 -6 data.tar':
206,070.39 msec task-clock # 9.602 CPUs utilized
10,333 context-switches # 50.143 /sec
35 cpu-migrations # 0.170 /sec
73,502 page-faults # 356.684 /sec
925,351,049,292 cycles # 4.490 GHz
945,596,486,369 instructions # 1.02 insn per cycle
106,039,632,660 branches # 514.580 M/sec
6,702,750,057 branch-misses # 6.32% of all branches
21.460119122 seconds time elapsed
205.460711000 seconds user
0.567559000 seconds sys
Versus:
$ time zstd -T0 --auto-threads=logical -19 data.tar
data.tar : 17.54% ( 452 MiB => 79.3 MiB, data.tar.zst)
Performance counter stats for 'zstd -T0 --auto-threads=logical -19 data.tar':
293,120.46 msec task-clock # 8.649 CPUs utilized
21,754 context-switches # 74.215 /sec
78 cpu-migrations # 0.266 /sec
9,806 page-faults # 33.454 /sec
1,317,565,940,985 cycles # 4.495 GHz
1,430,204,017,430 instructions # 1.09 insn per cycle
266,246,644,005 branches # 908.318 M/sec
5,762,322,300 branch-misses # 2.16% of all branches
33.889831439 seconds time elapsed
292.501337000 seconds user
0.567560000 seconds sys
So, 71.9 MiB in 21 seconds for xz -6 versus 79.3 MiB in 34 seconds for
zstd -19. In other words, xz produces 91% of the size in 63% of the
wall-clock time of zstd here.
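
For anyone who wants to re-check those two percentages, here is a
minimal shell sketch. The pct function is a hypothetical helper written
for this illustration (it is not part of xz or zstd); the inputs are
the compressed sizes in MiB and the elapsed seconds copied from the two
runs above.

```shell
# pct A B: print A as a whole-number percentage of B.
# (Hypothetical helper for illustration only.)
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 100 * a / b }'
}

# Compressed size: xz -6 (71.9 MiB) relative to zstd -19 (79.3 MiB).
echo "size: xz is $(pct 71.9 79.3)% of zstd"

# Elapsed time: xz -6 relative to zstd -19, in seconds.
echo "time: xz is $(pct 21.460119122 33.889831439)% of zstd"
```

This prints 91% for size and 63% for time.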
Per thread, zstd decompression is much, much faster than xz
decompression, but apparently zstd does not support multi-threaded
decompression while xz does. Here xz decompresses in about 120% of the
wall-clock time of zstd (about 0.6 seconds for xz vs 0.5 seconds for
zstd), but it is only able to perform that well by occupying most of
the CPU:
$ time xzcat -v -T12 data.tar.xz > /dev/null
data.tar.xz (1/1)
100 % 71.9 MiB / 452.5 MiB = 0.159
Performance counter stats for 'xzcat -v -T12 data.tar.xz':
5,434.51 msec task-clock # 8.720 CPUs utilized
1,187 context-switches # 218.419 /sec
22 cpu-migrations # 4.048 /sec
24,119 page-faults # 4.438 K/sec
24,311,239,346 cycles # 4.473 GHz
21,196,398,588 instructions # 0.87 insn per cycle
2,841,057,067 branches # 522.781 M/sec
296,751,808 branch-misses # 10.45% of all branches
0.623224953 seconds time elapsed
5.304562000 seconds user
0.127532000 seconds sys
$ time zstdcat -v -T12 data.tar.zst > /dev/null
Warning : decompression does not support multi-threading
Performance counter stats for 'zstdcat -v -T12 data.tar.zst':
559.03 msec task-clock # 1.075 CPUs utilized
4,245 context-switches # 7.593 K/sec
5 cpu-migrations # 8.944 /sec
1,032 page-faults # 1.846 K/sec
2,519,428,855 cycles # 4.507 GHz
5,752,165,946 instructions # 2.28 insn per cycle
943,510,461 branches # 1.688 G/sec
17,026,238 branch-misses # 1.80% of all branches
0.520219563 seconds time elapsed
0.518084000 seconds user
0.044177000 seconds sys
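
To put the CPU cost in numbers, here is a small sketch using the
elapsed-time and task-clock figures from the two decompression runs
above. As an assumption for illustration, pct is a hypothetical helper
and not a real tool.

```shell
# pct A B: print A as a whole-number percentage of B.
# (Hypothetical helper for illustration only.)
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 100 * a / b }'
}

# Elapsed seconds: xzcat -T12 relative to zstdcat.
echo "wall-clock: xz is $(pct 0.623224953 0.520219563)% of zstd"

# task-clock in msec (total CPU time across all threads).
echo "CPU time:   xz is $(pct 5434.51 559.03)% of zstd"
```

This prints 120% for wall-clock time and 972% for CPU time, so in this
run xz spends roughly ten times the CPU time of zstd to finish in a
similar wall-clock time.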
If xzcat is restricted to a single core, the performance is much worse
(about 3.5 seconds for xz vs 0.5 seconds for zstd), although I
understand from another post in the thread that dpkg performs
multi-threaded xz decompression.

This is on an ordinary "Intel(R) Xeon(R) E-2236 CPU @ 3.40GHz", which
is a four-year-old, 6-core, 12-thread processor.
--
Robert Edmonds
edmonds@debian.org