
Re: Enabling uscan to simply remove files from upstream source



Hi,

here is the third summary of the proposal:

 1. The new field Files-Excluded in debian/copyright contains a
    whitespace-separated list of globs (as used by find and for
    specifying file lists in machine-readable debian/copyright files).
    The deletion process will loop over every expression and run

      rm -rf ${MAIN_SOURCE_DIR}/<expression>

    An example copyright file would look like this:

Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Source: http://susy.oddbird.net/
  Repackaged, excluding non-DFSG licensed fonts and source-less
  JavaScript
Files-Excluded:
  docs/source/fonts/*
  docs/source/javascripts/jquery-1.7.1.min.js
  docs/source/javascripts/modernizr-2.5.3.min.js

 2. If the source tarball contains files matching these globs, it will
    be repackaged unless the option --no-exclusion is given on the
    uscan command line or USCAN_NO_EXCLUSION is set in
    /etc/devscripts.conf or ~/.devscripts.

 3. If the tarball did not contain any files matching the globs in
    debian/copyright::Files-Excluded, it should be left untouched
    (unless repackaging is needed anyway because of the compression
    method or because the user forces --repack).

 4. If something was removed, '+dfsg' will be appended to the version
    string to express the fact that the content of the original source
    was changed.
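
To make points 1 to 4 a bit more concrete, here is a minimal shell
sketch. It is not the actual uscan implementation: the function names
exclude_files and dfsg_version are hypothetical, and the sketch assumes
that the globs contain no whitespace and are expanded by the shell
relative to the main source directory.

```shell
#!/bin/sh
# Hypothetical helper: remove every path below SOURCE_DIR that matches
# one of the given globs.
# Usage: exclude_files SOURCE_DIR GLOB...
# Returns 0 if at least one path was removed, 1 otherwise.
exclude_files() {
    src_dir=$1
    shift
    removed=1
    for pattern in "$@"; do
        # $pattern is deliberately unquoted so that the shell expands
        # the glob relative to the source directory.
        for path in "$src_dir"/$pattern; do
            # A glob without any match stays literal, so check that
            # the path really exists (or is a dangling symlink).
            if [ -e "$path" ] || [ -L "$path" ]; then
                rm -rf -- "$path"
                removed=0
            fi
        done
    done
    return "$removed"
}

# Hypothetical helper: append '+dfsg' to an upstream version string.
# Meant to be called only when exclude_files removed something.
dfsg_version() {
    printf '%s+dfsg\n' "$1"
}

# Example (hypothetical directory and version):
#   if exclude_files susy-1.0 'docs/source/fonts/*'; then
#       version=$(dfsg_version 1.0)
#   fi
```

The return value of exclude_files is what would decide between
repackaging (point 2) and leaving the tarball untouched (point 3).
Note that a shell '*' does not match dot files, which may or may not
be what we want here.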

This discussion brought up additional new wishlist features for uscan:

 a) Configurable options when repacking (this is somewhat related to
    the suggestion above, but I would like to split this off into a
    different task).

 b) Uscan should be enabled to download VCS repositories, and once
    it does, deletion of files should be possible according to the same
    method as above (this is an interesting feature in principle, but once
    uscan is able to delete files it can do so for any download method).

 c) The suggested repackaging method was requested for non-uscan
    based downloads (for instance from VCS), which might influence
    the final implementation: it could become a separate tool which
    could simply be called by uscan (and others).

 d) Enable configuration of the compression method.  I'd consider this
    an unrelated feature which could also be useful for --repack.
    I admit that once we are repackaging anyway it might be reasonable to
    be able to influence the compression method, but I would also like
    to split this off into a different task.

Regarding the implementation there was some uncertainty about the
actual Perl module to use.  In the attached example script I decided to
stick to Dpkg::Control::Hash and left the code for Parse::DebControl as
a comment, which could pretty easily replace the other parser.  The
code works for me; however, there might be some remaining empty
directories, which I'm tempted to delete as well via an "educated"

   find tmp -type d -empty -delete

which means I would take care to delete only those directories that
became empty through the previous removal process and not those
directories which were originally empty in the tarball.  On the other
hand we might simply keep those empty dirs, since in the end they do
no harm.
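
One way such an "educated" cleanup could look is sketched below. This
is only an illustration under assumptions: the function names
list_empty_dirs and prune_new_empty_dirs are hypothetical, GNU find is
assumed (for -empty and -mindepth), and directory names must not
contain newlines. The idea is to record the empty directories before
the removal step and afterwards to delete only those empty directories
that are not on that list.

```shell
#!/bin/sh
# Hypothetical helper: print all empty directories below DIR, one per line.
# Usage: list_empty_dirs DIR
list_empty_dirs() {
    find "$1" -mindepth 1 -type d -empty
}

# Hypothetical helper: remove directories that are empty now but are not
# listed in KEEP_LIST (a file as written by list_empty_dirs before the
# file removal took place).
# Usage: prune_new_empty_dirs DIR KEEP_LIST
prune_new_empty_dirs() {
    src_dir=$1
    keep_list=$2
    # -depth visits children before their parent, so a parent that
    # becomes empty because its last subdirectory was just pruned is
    # removed in the same run.
    find "$src_dir" -mindepth 1 -depth -type d -empty \
        -exec sh -c 'grep -Fxq -- "$1" "$2" || rmdir -- "$1"' sh {} "$keep_list" \;
}

# Example (hypothetical paths):
#   list_empty_dirs tmp > empty-before.list
#   ... remove the excluded files ...
#   prune_new_empty_dirs tmp empty-before.list
```

Directories which were already empty in the tarball are kept this way,
at the price of one extra find run before the removal.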

Any further hints / remarks?

Kind regards

      Andreas.

--
http://fam-tille.de
#!/usr/bin/perl
use strict;
use warnings;

my $parsefile = 'debian/copyright';

# Dpkg::Control::Hash preferred by James McCoy (who did the last three uscan.pl edits using a debian.org e-mail address)
use Dpkg::Control::Hash;
my $data = Dpkg::Control::Hash->new();
$data->load($parsefile);

# Parse::DebControl suggested by Jonas Smedegaard
# use Parse::DebControl;
# my $parser = new Parse::DebControl(1);
# my $data = $parser->parse_file($parsefile, {discardCase=>1,singleBlock=>1,});

my $okformat = qr'http://www.debian.org/doc/packaging-manuals/copyright-format/1.0';
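my $main_source_dir = $ARGV[0]
    or die "Usage: $0 <main source dir>\n";
# Refuse to act on copyright files that are not in the machine-readable format.
die "Unsupported or missing Format field in $parsefile\n"
    unless (($data->{'format'} // '') =~ m{^$okformat/?$});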
if ( $data->{'files-excluded'} ) {
    my $nfiles_before = `find $main_source_dir | wc -l`;
    foreach (grep {/\//} split /\s+/, $data->{"files-excluded"}) {
        `find $main_source_dir -path "$main_source_dir/$_" -delete`;
    };
    foreach (grep {/^[^\/]+$/} split /\s+/, $data->{"files-excluded"}) {
        # Quote the glob so the shell passes it to find unexpanded.
        `find $main_source_dir -type f -name "$_" -delete`;
    };
    my $nfiles_after = `find $main_source_dir | wc -l`;
    if ( $nfiles_before == $nfiles_after ) {
        print "Source tree remains identical - no need for repacking.\n";
    }
}
