
Re: [RFC] Enhance checksum support



On Sat, Jan 19, 2008 at 06:15:45PM +0100, Frank Lichtenheld wrote:
> > Having it be:
> >   Contents: sha256
> >    28ee6a10eb280ede4b19c1b975aff5533016a26de67ba9212d51ffaea020ce34 355 foo
> >   Files:
> >    4bf7ff17bd9ddf3846d9065b3c594fb4 355 foo
> > or similar would be nice and non-redundant, and make it possible to drop
> I can see the "nice". But once I want to include more than one checksum
> it quickly gets redundant.

Well, it's "redundant" in the sense that it repeats filename and size info;
but size is an integral part of the hashing (for at least some hashes it's
much easier to break them if you can have different-sized files).

The advantage of having all that in one place is that you can verify the
hash properly with a simple script like:

	cat *.changes |
	  sed -ne '/^Checksums: sha256/,/^[^ ]/{/^ /p}' |
	  while read hash size file; do
	     # read via stdin so wc prints only the byte count
	     if [ "$(wc -c < "$file")" -eq "$size" ] &&
	        [ "$(sha256sum < "$file" | cut -d' ' -f1)" = "$hash" ]
	     then
	        echo "$file is OK"
	     else
	        echo "$file is BAD"
	     fi
	  done

i.e., you just need to find the right section of the .changes/.dsc file, and
from that point parsing is trivial. If you don't like sed linenoise,
grep-dctrl does the same job:

	gpg < ${changes} 2>/dev/null |
		grep-dctrl -s Checksums-sha256 "" | grep '^ ' |
		while read hash size file; do ...

> So maybe keep the Checksums field and introduce a Contents field that
> contains no checksums, but only the size and the name?
> Checksums:
>   md5 4bf7ff17bd9ddf3846d9065b3c594fb4 foo
>   sha256 28ee6a10eb280ede4b19c1b975aff5533016a26de67ba9212d51ffaea020ce34 foo

Having the hash as a parameter instead of in the field is a bit confusing
but still easy to parse; having the size separated out makes things much
more awkward though.
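
To make the awkwardness concrete, here's a rough sketch of what a verifier
would have to do first if sizes lived in a separate field. The layout in the
comment is only my guess at what the proposal would look like, and
joined_entries is a made-up name:

```shell
# Assumed layout (not settled anywhere in this thread):
#   Contents:
#    {size} {name}
#   Checksums:
#    {algo} {hash} {name}
# Print "{hash} {size} {name}" for one algorithm by joining the two fields.
# Assumes Contents appears before Checksums in the file.
joined_entries() {
	algo=$1 file=$2
	awk -v a="$algo" '
		/^[^ ]/ { sec = $1 }                            # remember current field name
		/^ / && sec == "Contents:"  { size[$2] = $1 }   # record size by file name
		/^ / && sec == "Checksums:" && $1 == a {
			print $2, size[$3], $3                  # hash, looked-up size, name
		}
	' "$file"
}
```

Only after that join do you get back to the "hash size file" triples that
the while loop above can consume directly.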

By confusing, I mean things like:

	Checksums:
	 sha1 a0a53d15d7dbc6a9cdfd1889ae30ba8b3dbf7d94 foo
	 md5 4bf7ff17bd9ddf3846d9065b3c594fb4 bar

become ambiguous -- is it an error, or are you meant to be using md5 to
verify bar and sha1 to verify foo, along the lines of Tim's suggestion?
It's also slightly harder to see what hashes are in use -- you can't just
check the presence of a field, you have to grep the contents of the field.
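
Roughly, the difference looks like this. It's a sketch only: the field names
are the ones floated in this thread, not anything decided, and both function
names are invented for illustration:

```shell
# One field per hash: checking for sha256 is a single grep on the field name.
has_field_sha256() {
	grep -q '^Checksums-sha256:' "$1"
}

# Hash as a parameter: you have to read the indented lines of the single
# Checksums field and collect the distinct algorithm names from them.
hashes_in_use() {
	sed -ne '/^Checksums:/,/^[^ ]/{/^ /p;}' "$1" |
		awk '{ print $1 }' |
		sort -u
}
```

With the sha1/md5 example above, hashes_in_use would list both md5 and sha1
without telling you whether every file is covered by each of them.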

For comparison, the Release files use the format:

	MD5Sum:
	 {md5} {size} {path}
	SHA1:
	 {sha1} {size} {path}
	SHA256:
	 {sha256} {size} {path}
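
Pulling one section out of a file in that layout is about as simple as the
sed above. A minimal sketch, assuming each section is a bare field name
followed by space-indented lines (release_entries is a hypothetical helper
name):

```shell
# Print the "{hash} {size} {path}" lines of one section (e.g. SHA256)
# from a Release-style file.
release_entries() {
	field=$1 file=$2
	awk -v f="$field:" '
		$0 == f      { insec = 1; next }    # start of the wanted section
		/^[^ ]/      { insec = 0 }          # any other unindented line ends it
		insec && NF  { print $1, $2, $3 }   # indented entry line
	' "$file"
}
```

Something like "release_entries SHA256 Release | while read hash size path"
then feeds the same verification loop as before.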

Cheers,
aj
