On Thu, 2012-12-27 at 08:38 -0800, Jonathan Nieder wrote: > Abou Al Montacir wrote: > > > Hover, I assume we can save this extra code as soon as we don't loose > > data. > > That's fine with me. All you'd need to do is error out if there is > anything after the first stream. That would make it a conformant > decoder and prevent silent data loss, though it would mean busybox > couldn't read the XZ files pxz produces. Sure, > (Context: the spec permits single-stream decoders because there are > decoders in the wild that need to be very simple, like the one built > into the Linux kernel that unpacks the kernel and initramfs.) > > On the other hand, if busybox is to start decoding concatenated > streams (imitating the standard "xz" command), then the spec requires > also correctly implementing padding. This might sound rigid, but it > is important for interoperability --- without such requirements, > whenever you share XZ files there would be a lot of confusion about > whether it is valid and which implementations can and can't decode it. Agree > I think busybox upstream would agree that the spec shouldn't just be > ignored. You convinced me here. I admit you're right the we should either conform to spec or conform to spec, no choice there. I've fixed my patch and think that know it should really be conformant. I also attached some short samples to be tested. One of them only should fail to decode. I really thank you for your support and review as well as for your sense of details. I admit I've learned from you many things. Thanks,
Author: Abou Al Montacir <abou.almontacir@sfr.fr>
Purpose: Fix decompression of multi stream XZ compressed files
(Closes: bug#686502)
--- busybox-1.20.0/archival/libarchive/decompress_unxz.c 2012-04-22 03:33:23.000000000 +0200
+++ busybox-1.20.0/debian/build/deb/archival/libarchive/decompress_unxz.c 2012-12-27 21:58:49.000000000 +0100
@@ -44,6 +44,7 @@
struct xz_dec *state;
unsigned char *membuf;
IF_DESKTOP(long long) int total = 0;
+ enum xz_ret r;
if (!global_crc32_table)
global_crc32_table = crc32_filltable(NULL, /*endian:*/ 0);
@@ -59,12 +60,11 @@
strcpy((char*)membuf, HEADER_MAGIC);
iobuf.in_size = HEADER_MAGIC_SIZE;
} /* else: let xz code read & check it */
-
- /* Limit memory usage to about 64 MiB. */
+ /* First stream is identical to starting a new stream after finishing decoding an old one */
state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024);
+ r = XZ_OK;
while (1) {
- enum xz_ret r;
if (iobuf.in_pos == iobuf.in_size) {
int rd = safe_read(src_fd, membuf, BUFSIZ);
@@ -73,12 +73,36 @@
total = -1;
break;
}
+ /* No more bytes in stream. Stop */
+ if (rd == 0) {
+ break;
+ }
iobuf.in_size = rd;
iobuf.in_pos = 0;
}
+ if (r == XZ_STREAM_END) {
+ /* Eat padding. Stream never starts with zeros, and padding is 32 aligned */
+ while ((iobuf.in_pos < iobuf.in_size) && (iobuf.in[iobuf.in_pos] == 0)) {
+ iobuf.in_pos += 1;
+ }
+ /* Reached end of buffer. Fill it again from stream */
+ if (iobuf.in_pos == iobuf.in_size) {
+ continue;
+ }
+ if(iobuf.in_pos % 4){
+ r = XZ_DATA_ERROR;
+ }
+ }
// bb_error_msg(">in pos:%d size:%d out pos:%d size:%d",
// iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size);
- r = xz_dec_run(state, &iobuf);
+ /* Initialize decoder for new stream. Limit memory usage to about 64 MiB. */
+ if (r == XZ_STREAM_END) {
+ state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024);
+ r = XZ_OK;
+ }
+ if ((r == XZ_OK) || (r == XZ_UNSUPPORTED_CHECK)){
+ r = xz_dec_run(state, &iobuf);
+ }
// bb_error_msg("<in pos:%d size:%d out pos:%d size:%d r:%d",
// iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size, r);
if (iobuf.out_pos) {
@@ -87,7 +111,9 @@
iobuf.out_pos = 0;
}
if (r == XZ_STREAM_END) {
- break;
+ xz_dec_end(state);
+ /* Look for any other streams */
+ continue;
}
if (r != XZ_OK && r != XZ_UNSUPPORTED_CHECK) {
bb_error_msg("corrupted data");
@@ -95,7 +121,6 @@
break;
}
}
- xz_dec_end(state);
free(membuf);
return total;
Attachment:
hel0000.xz
Description: application/xz
Attachment:
hel000lo.xz
Description: application/xz
Attachment:
hel0000lo.xz
Description: application/xz
Attachment:
hello.xz
Description: application/xz
Attachment:
hello1.xz
Description: application/xz
Attachment:
signature.asc
Description: This is a digitally signed message part