[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#686502: pxz produces archives broken for busybox's unxz



On Thu, 2012-12-27 at 08:38 -0800, Jonathan Nieder wrote:
> Abou Al Montacir wrote:
> 
> > Hover, I assume we can save this extra code as soon as we don't loose
> > data.
> 
> That's fine with me.  All you'd need to do is error out if there is
> anything after the first stream.  That would make it a conformant
> decoder and prevent silent data loss, though it would mean busybox
> couldn't read the XZ files pxz produces.
Sure,

> (Context: the spec permits single-stream decoders because there are
> decoders in the wild that need to be very simple, like the one built
> into the Linux kernel that unpacks the kernel and initramfs.)
> 
> On the other hand, if busybox is to start decoding concatenated
> streams (imitating the standard "xz" command), then the spec requires
> also correctly implementing padding.  This might sound rigid, but it
> is important for interoperability --- without such requirements,
> whenever you share XZ files there would be a lot of confusion about
> whether it is valid and which implementations can and can't decode it.
Agree

> I think busybox upstream would agree that the spec shouldn't just be
> ignored.
You convinced me here. I admit you're right the we should either conform
to spec or conform to spec, no choice there.

I've fixed my patch and think that know it should really be conformant.
I also attached some short samples to be tested. One of them only should
fail to decode.

I really thank you for your support and review as well as for your sense
of details. I admit I've learned from you many things.

Thanks,
Author: Abou Al Montacir <abou.almontacir@sfr.fr>
Purpose: Fix decompression of multi stream XZ compressed files
 (Closes: bug#686502)

--- busybox-1.20.0/archival/libarchive/decompress_unxz.c	2012-04-22 03:33:23.000000000 +0200
+++ busybox-1.20.0/debian/build/deb/archival/libarchive/decompress_unxz.c	2012-12-27 21:58:49.000000000 +0100
@@ -44,6 +44,7 @@
 	struct xz_dec *state;
 	unsigned char *membuf;
 	IF_DESKTOP(long long) int total = 0;
+	enum xz_ret r;
 
 	if (!global_crc32_table)
 		global_crc32_table = crc32_filltable(NULL, /*endian:*/ 0);
@@ -59,12 +60,11 @@
 		strcpy((char*)membuf, HEADER_MAGIC);
 		iobuf.in_size = HEADER_MAGIC_SIZE;
 	} /* else: let xz code read & check it */
-
-	/* Limit memory usage to about 64 MiB. */
+	/* First stream is identical to starting a new stream after finishing decoding an old one */
 	state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024);
+	r = XZ_OK;
 
 	while (1) {
-		enum xz_ret r;
 
 		if (iobuf.in_pos == iobuf.in_size) {
 			int rd = safe_read(src_fd, membuf, BUFSIZ);
@@ -73,12 +73,36 @@
 				total = -1;
 				break;
 			}
+			/* No more bytes in stream. Stop */
+			if (rd == 0) {
+				break;
+			}
 			iobuf.in_size = rd;
 			iobuf.in_pos = 0;
 		}
+		if (r == XZ_STREAM_END) {
+			/* Eat padding. Stream never starts with zeros, and padding is 32 aligned */
+			while ((iobuf.in_pos < iobuf.in_size) && (iobuf.in[iobuf.in_pos] == 0)) {
+					iobuf.in_pos += 1;
+			}
+			/* Reached end of buffer. Fill it again from stream */
+			if (iobuf.in_pos == iobuf.in_size) {
+				continue;
+			}
+			if(iobuf.in_pos % 4){
+				r = XZ_DATA_ERROR;
+			}
+		}
 //		bb_error_msg(">in pos:%d size:%d out pos:%d size:%d",
 //				iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size);
-		r = xz_dec_run(state, &iobuf);
+			/* Initialize decoder for new stream. Limit memory usage to about 64 MiB. */
+		if (r == XZ_STREAM_END) {
+			state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024);
+			r = XZ_OK;
+		}
+		if ((r == XZ_OK) || (r == XZ_UNSUPPORTED_CHECK)){
+			r = xz_dec_run(state, &iobuf);
+		}
 //		bb_error_msg("<in pos:%d size:%d out pos:%d size:%d r:%d",
 //				iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size, r);
 		if (iobuf.out_pos) {
@@ -87,7 +111,9 @@
 			iobuf.out_pos = 0;
 		}
 		if (r == XZ_STREAM_END) {
-			break;
+			xz_dec_end(state);
+			/* Look for any other streams */
+			continue;
 		}
 		if (r != XZ_OK && r != XZ_UNSUPPORTED_CHECK) {
 			bb_error_msg("corrupted data");
@@ -95,7 +121,6 @@
 			break;
 		}
 	}
-	xz_dec_end(state);
 	free(membuf);
 
 	return total;

Attachment: hel0000.xz
Description: application/xz

Attachment: hel000lo.xz
Description: application/xz

Attachment: hel0000lo.xz
Description: application/xz

Attachment: hello.xz
Description: application/xz

Attachment: hello1.xz
Description: application/xz

Attachment: signature.asc
Description: This is a digitally signed message part


Reply to: