On Thu, 2012-12-27 at 08:38 -0800, Jonathan Nieder wrote: > Abou Al Montacir wrote: > > > Hover, I assume we can save this extra code as soon as we don't loose > > data. > > That's fine with me. All you'd need to do is error out if there is > anything after the first stream. That would make it a conformant > decoder and prevent silent data loss, though it would mean busybox > couldn't read the XZ files pxz produces. Sure, > (Context: the spec permits single-stream decoders because there are > decoders in the wild that need to be very simple, like the one built > into the Linux kernel that unpacks the kernel and initramfs.) > > On the other hand, if busybox is to start decoding concatenated > streams (imitating the standard "xz" command), then the spec requires > also correctly implementing padding. This might sound rigid, but it > is important for interoperability --- without such requirements, > whenever you share XZ files there would be a lot of confusion about > whether it is valid and which implementations can and can't decode it. Agree > I think busybox upstream would agree that the spec shouldn't just be > ignored. You convinced me here. I admit you're right the we should either conform to spec or conform to spec, no choice there. I've fixed my patch and think that know it should really be conformant. I also attached some short samples to be tested. One of them only should fail to decode. I really thank you for your support and review as well as for your sense of details. I admit I've learned from you many things. Thanks,
Author: Abou Al Montacir <abou.almontacir@sfr.fr> Purpose: Fix decompression of multi stream XZ compressed files (Closes: bug#686502) --- busybox-1.20.0/archival/libarchive/decompress_unxz.c 2012-04-22 03:33:23.000000000 +0200 +++ busybox-1.20.0/debian/build/deb/archival/libarchive/decompress_unxz.c 2012-12-27 21:58:49.000000000 +0100 @@ -44,6 +44,7 @@ struct xz_dec *state; unsigned char *membuf; IF_DESKTOP(long long) int total = 0; + enum xz_ret r; if (!global_crc32_table) global_crc32_table = crc32_filltable(NULL, /*endian:*/ 0); @@ -59,12 +60,11 @@ strcpy((char*)membuf, HEADER_MAGIC); iobuf.in_size = HEADER_MAGIC_SIZE; } /* else: let xz code read & check it */ - - /* Limit memory usage to about 64 MiB. */ + /* First stream is identical to starting a new stream after finishing decoding an old one */ state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024); + r = XZ_OK; while (1) { - enum xz_ret r; if (iobuf.in_pos == iobuf.in_size) { int rd = safe_read(src_fd, membuf, BUFSIZ); @@ -73,12 +73,36 @@ total = -1; break; } + /* No more bytes in stream. Stop */ + if (rd == 0) { + break; + } iobuf.in_size = rd; iobuf.in_pos = 0; } + if (r == XZ_STREAM_END) { + /* Eat padding. Stream never starts with zeros, and padding is 32 aligned */ + while ((iobuf.in_pos < iobuf.in_size) && (iobuf.in[iobuf.in_pos] == 0)) { + iobuf.in_pos += 1; + } + /* Reached end of buffer. Fill it again from stream */ + if (iobuf.in_pos == iobuf.in_size) { + continue; + } + if(iobuf.in_pos % 4){ + r = XZ_DATA_ERROR; + } + } // bb_error_msg(">in pos:%d size:%d out pos:%d size:%d", // iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size); - r = xz_dec_run(state, &iobuf); + /* Initialize decoder for new stream. Limit memory usage to about 64 MiB. */ + if (r == XZ_STREAM_END) { + state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024); + r = XZ_OK; + } + if ((r == XZ_OK) || (r == XZ_UNSUPPORTED_CHECK)){ + r = xz_dec_run(state, &iobuf); + } // bb_error_msg("<in pos:%d size:%d out pos:%d size:%d r:%d", // iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size, r); if (iobuf.out_pos) { @@ -87,7 +111,9 @@ iobuf.out_pos = 0; } if (r == XZ_STREAM_END) { - break; + xz_dec_end(state); + /* Look for any other streams */ + continue; } if (r != XZ_OK && r != XZ_UNSUPPORTED_CHECK) { bb_error_msg("corrupted data"); @@ -95,7 +121,6 @@ break; } } - xz_dec_end(state); free(membuf); return total;
Attachment:
hel0000.xz
Description: application/xz
Attachment:
hel000lo.xz
Description: application/xz
Attachment:
hel0000lo.xz
Description: application/xz
Attachment:
hello.xz
Description: application/xz
Attachment:
hello1.xz
Description: application/xz
Attachment:
signature.asc
Description: This is a digitally signed message part