
Re: [idea]: Switch default compression from "xz" to "zstd" for .deb packages



Hi,

On Sat, Sep 16, 2023 at 10:31:20AM +0530, Hideki Yamane wrote:
>  Today I want to propose changing the default compression format in .deb
>  packages, {data,control}.tar, from "xz" to "zst".
> 
>  I want to hear your thoughts about this.

I am not very enthusiastic about this idea. I will skip over the arguments
already raised by others and add one that I haven't seen thus far: zstd is
quite optimized for 64-bit CPUs, and for amd64 in particular. amd64 is the
only architecture for which zstd ships a Huffman implementation in
assembly.

> ## More CPUs
> 
>  2012: ThinkPad L530 has Core i5-3320M (2 cores, 4 threads)
>  2023: ThinkPad L15 has Core i5-1335U (10 cores, 12 threads)
> 
>  https://www.cpubenchmark.net/compare/817vs5294/Intel-i5-3320M-vs-Intel-i5-1335U
>   - i5-3320M: single 1614, multicore 2654
>   - i5-1335U: single 3650, multicore 18076 points.

While the majority of CPUs in active deployment are amd64, I'd also like
to see numbers for 32-bit and non-x86 CPUs. I personally find zstd's
trade-off a good fit for a number of my use cases, but I was surprised by
just how slowly it decompresses on armhf.

I tested this on an arm board with a Linux kernel package sized 36MB
(xz-compressed); the measurement is roughly sketched below the table.

 algo     | compressed size | decompression time
----------+-----------------+-------------------
 xz       |     36MB        |  14.7s
 zstd     |     52MB        |   5.2s
 zstd -9  |     48MB        |   5.2s
 zstd -11 |     47MB        |   5.4s
 zstd -19 |     41MB        |   5.7s
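A sketch of that measurement (not the actual script; the package filename
is a stand-in, and it assumes binutils ar plus the xz and zstd command
line tools are installed):

  #!/usr/bin/python3
  # Rough sketch: pull data.tar out of a .deb, recompress it with zstd
  # at a few levels and time decompression of each variant.
  import os
  import subprocess
  import time

  DEB = "linux-image-example.deb"  # stand-in filename

  # Extract data.tar.xz from the ar archive and unpack it once.
  subprocess.run(["ar", "x", DEB, "data.tar.xz"], check=True)
  subprocess.run(["xz", "-d", "-k", "-f", "data.tar.xz"], check=True)

  def timed_decompress(cmd):
      start = time.perf_counter()
      subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
      return time.perf_counter() - start

  print("xz       %9d bytes  %5.1fs" % (
      os.path.getsize("data.tar.xz"),
      timed_decompress(["xz", "-dc", "data.tar.xz"])))

  for level in (3, 9, 11, 19):  # 3 is zstd's default level
      out = "data.tar.%d.zst" % level
      subprocess.run(["zstd", "-q", "-f", "-%d" % level, "data.tar",
                      "-o", out], check=True)
      print("zstd -%-2d %9d bytes  %5.1fs" % (
          level, os.path.getsize(out),
          timed_decompress(["zstd", "-dc", out])))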

Not as slow as I remembered, apparently, but even at -19 zstd still
carries a size overhead of more than 10% (41MB vs 36MB, about 14%). The
size ratio is consistent with Robert Edmonds's numbers, but we no longer
see the 10-fold decompression speedup. And this does not look at
decompression memory requirements at all.

I am decompressing a *lot* of .debs (dedup.d.n, the multiarch hinter,
crossqa.d.n, dumat). All of these applications would benefit from
zstd-compressed .debs in terms of decompression speed, yet decompression
has never been my bottleneck. Download speed matters more to me, and
swapping out a 1GBit link for a faster one isn't that easy.
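To put a rough number on that (a back-of-envelope sketch, not a
measurement; the 100GB corpus size is a made-up figure, and the ~14%
overhead is the zstd -19 vs xz ratio from the table above):

  #!/usr/bin/python3
  # When the 1GBit link is saturated, wall-clock time for bulk downloads
  # scales with the total number of bytes, not with decompression speed.
  LINK = 1e9 / 8          # 1GBit/s is roughly 125MB/s
  CORPUS = 100e9          # hypothetical pile of .debs, in bytes
  OVERHEAD = 41 / 36      # zstd -19 relative to xz, from the table above

  xz_s = CORPUS / LINK
  zst_s = CORPUS * OVERHEAD / LINK
  print("xz:   %4.0fs" % xz_s)                            # ~800s
  print("zstd: %4.0fs (+%.0fs)" % (zst_s, zst_s - xz_s))  # ~910s

However fast zstd decompresses, the extra bytes still cost wall-clock
time on the wire.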

I'd vote against this given the data we have now.

Can we defer the discussion until there are more convincing numbers?

Helmut

