[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [RFC] Enhance checksum support



Hi,

On Mon, 14 Jan 2008, Frank Lichtenheld wrote:
> dpkg-genchanges is easy insofar that not many programs actually need
> to read .changes files (dak, dput, and dupload come to mind). So we
> mostly just can increase the format number and let other people worry
> about whether they want support that new feature.
> 
> dpkg-source is of course more complicated. Also since the door for
> changing v2.0 is already closed we will need to label the next incompatible
> changes to the source format v3.0 (dpkg-source only checks the major number).

There's also a possibility of not breaking the compatibility by simply
adding a new field and leaving "Files" untouched:

Checksums:
 <kind-of-checksum> <checksum> <name>

I think it would be best that way. The size of the file then stay in the
Files field as would the md5sum. If the user enables additional checksums,
they end up in this new field.

> So we might need to begin collecting proposals for that (so far there is
> only joeyh's vcs support code AFAIR).

Yeah, it's definitely high on my personal priority list for dpkg-dev. I
wanted first to finish the cleanup of the old shared code that makes any
big changes painful. 

> One idea I would welcome ideas on is whether it might be a good idea to
> allow more than one checksum per file (which my current code doesn't support)?

I think it would be good, at least for consistency with APT which already
uses multiple checksums.

> +our %check_prog = ( 32 => 'md5sum', 40 => 'sha1sum', 64 => 'sha256sum' );

I don't like the idea to index the algorithm by the size of the hash. We
might want to be able to implement several algorithms whose output are of
the same size. (That's also why I added an explicit <kind-of-checksum> in
the Checksums field above).

> +sub getchecksum {
> +    my ($checksum_prog, $file) = @_;
> +    open(STDIN, "<", $file) ||
> +	syserr(_g("cannot open file %s for reading"), $file);
> +    (my @s = stat(STDIN)) || syserr(_g("cannot fstat file %s"), $file);
> +    my $size = $s[7];
> +    my $sum = `$checksum_prog`;
> +    $? && subprocerr("%s %s", $checksum_prog, $file);
> +    open(STDIN, "<", "/dev/null")
> +	|| &syserr(_g("reopen stdin from /dev/null"));
> +    $sum = extractchecksum($sum);
> +
> +    return ($sum, $size);
> +}

I don't like functions which have a side-effect of changing STDIN/STDOUT.
I know we have lots of those in dpkg-source and in various other scripts
but I'd rather get rid of that strange practice than use it for
convenience. 

We might want to write our own helper functions to fork and redirect
input/output in order to avoid duplicating too much the same logic but I
would prefer that to such tricks.

> index e27f294..a0b1a96 100644
> --- a/scripts/Makefile.am
> +++ b/scripts/Makefile.am
> @@ -89,6 +89,7 @@ nobase_dist_perllib_DATA = \
>  	Dpkg/Cdata.pm \
>  	Dpkg/Changelog.pm \
>  	Dpkg/Changelog/Debian.pm \
> +	Dpkg/Checksums.pm \
>  	Dpkg/Compression.pm \
>  	Dpkg/Control.pm \
>  	Dpkg/Deps.pm \
> diff --git a/scripts/po/POTFILES.in b/scripts/po/POTFILES.in
> index 4a14b35..58b0090 100644
> --- a/scripts/po/POTFILES.in
> +++ b/scripts/po/POTFILES.in
> @@ -18,6 +18,7 @@ scripts/Dpkg/Arch.pm
>  scripts/Dpkg/Cdata.pm
>  scripts/Dpkg/Changelog.pm
>  scripts/Dpkg/Changelog/Debian.pm
> +scripts/Dpkg/Compression.pm
>  scripts/Dpkg/Control.pm
>  scripts/Dpkg/Deps.pm
>  scripts/Dpkg/ErrorHandling.pm

You probably meant Checksums.pm here and not Compression.pm :)

Cheers,
-- 
Raphaël Hertzog

Le best-seller français mis à jour pour Debian Etch :
http://www.ouaza.com/livre/admin-debian/


Reply to: