
Re: Extracting individual files or directories from XYZ.tar.xz - Possible?



On 2025-08-10 10:09:32 +0200, Nicolas George wrote:
> Vincent Lefevre (HE12025-08-10):
> > Big files have several blocks. See my example. And that's with
> > the default options.
> 
> What you showed is not default options, especially not for archives.
> Nobody does “tar -cf out.tar … ; xz out.tar”, only “tar -c … | xz >
> out.tar.xz”, or -J that does the same internally.

There are still several blocks:

qaa% tar cJf archive.tar.xz PROGRAMME-FFC-2024.pdf
qaa% xz -lv archive.tar.xz                        
archive.tar.xz (1/1)
  Streams:           1
  Blocks:            2
  Compressed size:   38.7 MiB (40530728 B)
  Uncompressed size: 39.1 MiB (41000960 B)
  Ratio:             0.989
  Check:             CRC64
  Stream Padding:    0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         2               0               0        40530728        41000960  0.989  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0        24961388        25165824  0.992  CRC64
         1         2        24961400        25165824        15569292        15835136  0.983  CRC64

And the -J option would more readily give tar the opportunity to
create blocks or streams that each cover a single (big) file. That
is possible even if an external xz is used (several streams could
then be created):

qaa% echo foo | xz - > file.xz
qaa% echo bar | xz - >> file.xz
qaa% xzcat file.xz
foo
bar
qaa% xz -lv file.xz
file.xz (1/1)
  Streams:           2
  Blocks:            2
  Compressed size:   136 B
  Uncompressed size: 8 B
  Ratio:             ---
  Check:             CRC64
  Stream Padding:    0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0              68               4    ---  CRC64            0
         2         1              68               4              68               4    ---  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0              36               4  9.000  CRC64
         2         1              80               4              36               4  9.000  CRC64
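
For illustration, the same concatenation can be reproduced with
Python's standard lzma module. This is only a sketch: the variable
names are mine, and it relies on the documented behavior that
lzma.decompress() accepts several concatenated streams. It also shows
that, once the compressed size of the first stream is known, the
second stream can be decoded without touching the first one.

```python
import lzma

# Two independent .xz streams, written back to back,
# like "xz - > file.xz" followed by "xz - >> file.xz".
first = lzma.compress(b"foo\n")
second = lzma.compress(b"bar\n")
data = first + second

# Decoding the whole file yields the concatenation of both payloads.
whole = lzma.decompress(data)

# Knowing the compressed size of the first stream, it can be skipped
# entirely: decoding starts directly at the second stream.
only_second = lzma.decompress(data[len(first):])

print(whole)
print(only_second)
```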

To optimize partial extraction, an external xz could still be used,
but tar would need to use the stream and block structures to obtain
the sizes of the compressed and uncompressed data, so that data could
be skipped instead of being uncompressed.
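
A rough sketch of that lookup, in Python. The block table is copied
from the "xz -lv archive.tar.xz" output above; the Block type, the
blocks_for_range() helper and the member offset in the usage lines
are hypothetical, not something tar or xz provides. It assumes that
each block can be decoded on its own once its compressed offset is
known.

```python
from typing import NamedTuple


class Block(NamedTuple):
    comp_offset: int    # CompOffset column
    uncomp_offset: int  # UncompOffset column
    total_size: int     # TotalSize column (compressed, with headers)
    uncomp_size: int    # UncompSize column


# Values taken from the block list of archive.tar.xz shown earlier.
BLOCKS = [
    Block(12, 0, 24961388, 25165824),
    Block(24961400, 25165824, 15569292, 15835136),
]


def blocks_for_range(blocks, start, length):
    """Return the blocks overlapping uncompressed bytes
    [start, start + length); length is assumed to be positive."""
    end = start + length
    return [
        b for b in blocks
        if b.uncomp_offset < end and start < b.uncomp_offset + b.uncomp_size
    ]


# Hypothetical tar member: 1000000 bytes at uncompressed offset 30000000.
needed = blocks_for_range(BLOCKS, 30000000, 1000000)
for b in needed:
    print("decode block at compressed offset", b.comp_offset)
```

Here only the second block is needed, so a reader could seek to
compressed offset 24961400 and never decompress the first block.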

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)

