[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

tpm2licenses



Dear Frank!

During the weekend I could hack the tpm2licenses file a bit. I will
attach the full fill, but also include a annotated (+-)diff here.

The main changes are:
- new config variables:
  --listallfiles
    used to list all files even if no license information is present
  --texmfPath
    replaces the manual setting of the texmfPath variable in the script
- tpm files can be used without changes straight from the texlive tree
- the catalogue config variable can be file:xyz, in this case the
  file xyz should cotain tpm:licline lines (tpm without the leading .tpm)
- Coverage check: After going through the listing of licenses per file,
  a list of files which is not covered by one of the above statemnents
  is given
- texlive is not treated specifically, it also uses the TpmFileGlob
- missing directories are just next-ed and not died upon
  Reason behind at least for me: Some directories are just not present
  in the install-tree as the have been blacklisted in tpm2deb.cfg.
  I don't want tpm2licenses die here, maybe we can make it an option.
  I want to use it in the debian/rules file to generate the license text
  for all each package specifically (this works already in the depot).
 
Please Frank, could you test whether this works for the tetex packages,
too? For the texmfPath variable no change should be necessary as the
default is "." which is what is necessary for tetex. Same for the
listallfiles which is new and without it the script should work as
before.
 
TODO, or what I would like from tpm2licenses (and I didn't come around
to implement):
- treatment of the special cases of tpm2deb.cfg hacks (only for me)
- list of missing files (present in the tpm but not in the file system) 
  (do we need this is the License statement?)

Best wishes

Norbert



Usual blabla start

--- tpm2licenses	2006-04-13 13:48:08.000000000 +0200
+++ tpm2licenses.norbert	2006-04-25 15:39:31.000000000 +0200
@@ -1,7 +1,8 @@
 #!/usr/bin/perl -w
 #
 # tpm2licenses.pl
-# (c) 2005 Norbert Preining
+# (c) 2005-2006 Norbert Preining
+# (c) 2006 Frank Küster
 #
 # Lists for every filename.tpm the license as specified in the catalogue
 #
@@ -12,6 +13,7 @@
 #       --nocheckcatalogue
 #       --tpmdir
 #       --package
+#       --listallfiles
 # optional tpm file: check only that one
 #
 

catering for the new config variables and a fallbacl texlive package
in fact now it is only important to know eith texlive or tetex.

@@ -40,7 +42,7 @@
 
 
 # initialize AppConfig
-my $config = AppConfig->new("catalogue=s", "nocheckcatalogue", "tpmdir=s", "package=s", "what=s");
+my $config = AppConfig->new("catalogue=s", "nocheckcatalogue", "tpmdir=s", "package=s", "what=s", "listallfiles", "texmfPath=s");
 
 # parse configurationfile, if present
 my @cfgDirs = (".","./debian","..","~");
@@ -61,11 +63,15 @@
 my $debian_package = $config->package() ? $config->package() : "tetex-base";
 my $tpmdir = $config->tpmdir() ? $config->tpmdir() : "./debian/tpm";
 my $nocatalogue = $config->nocheckcatalogue() ? $config->nocheckcatalogue() : '';
+my $listallfiles = $config->listallfiles() ? 1 : 0;
+my $texmfPathString = $config->texmfPath() ? $config->texmfPath() : ".";
+my @texmfPath = split ' ', $texmfPathString;
 
 if ($debian_package) {
   die "Unknown Debian package: $debian_package." unless
       ( $debian_package =~ /^tetex-base$/    || 
 	$debian_package =~ /^tetex-src$/     ||
+	$debian_package =~ /^texlive$/       ||
 	$debian_package =~ /^texlive-base$/  ||
 	$debian_package =~ /^texlive-extra$/ ||
 	$debian_package =~ /^texlive-lang$/  ||


global variable which collected the covered files

@@ -134,6 +140,7 @@
 File::Basename::fileparse_set_fstype('unix');
 
 my @TpmList;
+my @coveredfiles;
 
 if (@ARGV) {
   # we have a (list of) packages on the command line


CreateTpmList is just compressed to the TpmDirGlob as texlive tpms can
be used without changes.

@@ -155,121 +162,112 @@
 my $printfiles = '';
 
 sub create_tpmlist {
-
-  if 	( $debian_package =~ /^tetex-/ ) {
-    foreach (<$TpmDirGlob >) {push(@TpmList,$_)};
-  };
-
-  if 	( $debian_package =~ /^texlive-/ ) {
-    my $cfgfile = "../../" . $debian_package . ".tpm4licenses.cfg";
-    my @cfgLines;
-    open CFGFILE, $cfgfile or die "could not open $cfgfile";
-    while (<CFGFILE>) {
-      # this could go into one line (next if...) if only Emacs would grok it...
-      if (m/^#/) { 
-	  next ; 
-	}
-      chomp;
-      push(@cfgLines,$_);
-    };
-    for (@cfgLines) {
-      my $tpmFullname;
-      if ( -f "texmf/tpm/" . $_ ) {
-	$tpmFullname = "texmf/tpm/" . $_
-	  }
-      elsif ( -f "texmf-dist/tpm/" . $_ ) {
-	$tpmFullname = "texmf-dist/tpm/" . $_
-	  }
-      elsif ( -f "texmf-doc/tpm/" . $_ ) {
-	$tpmFullname = "texmf-doc/tpm/" . $_
-	  }
-      else {
-	print STDERR "Could not find $_\n";
-	exit 1;
-      };
-      push(@TpmList,$tpmFullname);
-    };
-  }; #end texlive
+  foreach (<$TpmDirGlob >) {push(@TpmList,$_)};
 };
 


This seems to be missing in the original, as printfiles was once set to
true it was always true ...

The next just ignores non-package tpms (collection-* bin-*).

 sub list_licenses {
   foreach $LocalTPM (@TpmList) {
+    $printfiles = '';
     $licline = "";
     $bn = &basename($LocalTPM,".tpm");
+    next if ($bn =~ m/bin-|collection-/);
     if (defined($Tpm2Catalogue{$bn})) {
       $pkgcat = $Tpm2Catalogue{$bn};
     } else {
       $pkgcat = $bn;
     }
     $licline .= "$bn: ";


This just introduces and aditional level of if, ie if the Catalogue
setting is something like file:.... then the licline is grep-ed from it.
Ok, this is just a quick hack, but really I did want to get it running
soon ;-)
(this is the biggest part of the restructuring but was in fact only a
reindentation to the next if level, besides one or two small other
changes to make it work with listallfiles)

-    my $fletter = substr($pkgcat, 0, 1);
-    my $catname = "${Catalogue}/entries/$fletter/${pkgcat}.xml";
-    if (! -r $catname) {
-      $catname = "$tpmdir/${pkgcat}.xml";
+    if ($Catalogue =~ m/file:(.*)$/) {
+      $licline = `grep ^${bn}: $1`;
+      chomp $licline;
+      if ($licline eq "") { $licline = "tpm $bn not found in $1, strange"; }
+      $printfiles = 1;
+    } else {
+      my $fletter = substr($pkgcat, 0, 1);
+      my $catname = "${Catalogue}/entries/$fletter/${pkgcat}.xml";
....
many lines killed
....


etc as above, 
I reorganized the printing so that --listalllines works.  So the next
has to go if there is catalogue and the parsing of the xml file should
only be done unless ( ... (! -r $catname) ...)

+      unless ($nocatalogue  || (! -r $catname) || $pkgcat =~ m/^individual.*/) { 

...
-    $what eq "files" && ($printfiles || $nocatalogue) && printFiles($LocalTPM,$licline);
+    $what eq "files" && print "\n% $licline\n";
+    if ($what eq "files" && ($printfiles || $nocatalogue || $listallfiles)) {
+      printFiles($LocalTPM,$licline);
+    }
   }

Check the coverage. I have no idea how one can give MORE than one debian
package, but I use it atm always only for one, so what the foreach
(@debian_pacakge) is for I hae no idea.

+  CheckCoverage();
 

I also changed that the licline is printed out NOT in printFiles, but
independently, so that the --listallfiles works

 
   sub printFiles {
     my ($LocalTPM,$licline)= @_;
-    my $pkg_header = "% " . $licline;
+    my $pkg_header = "";
     my $dom_parser = new XML::DOM::Parser;
     my $doc = $dom_parser->parsefile($LocalTPM);
     my %SourceFiles = Tpm::getListField($doc, "SourceFiles");
     my %RunFiles = Tpm::getListField($doc, "RunFiles");
     my %DocFiles = Tpm::getListField($doc, "DocFiles");
 

The following is the trick to get around the bad hashed problem in as-is
tpm files. The idea is the following: Iff (if and only if) the {text}
key was missing, also the {size} key is missing in the original hash.
The getTextEntry (or so) sets the {text} key in any case, but we can
check whether it was present in the original case by checking for the
{size} key, and if this one is missing, setting the {text} value to the
empty string (where before was the bad bad hash).

+    #
+    # NORBERT
+    # getListField returns a hash, and s{text} SHOULD be an array reference
+    # why isn't it like this???
+    # If it would be an array reference one could easily check whether
+    # sourcefile(text) is empty or not!!!
+    # Trick: If it was emtpy there is not size key!
+    #
+    if (!defined($SourceFiles{"size"})) { 
+	$SourceFiles{"text"} = ""; 
+    }
+    if (!defined($DocFiles{"size"})) { 
+	$DocFiles{"text"} = ""; 
+    }
+    if (!defined($RunFiles{"size"})) { 
+	$RunFiles{"text"} = ""; 
+    }
+     
     foreach ($RunFiles{"text"}, $DocFiles{"text"}, $SourceFiles{"text"}) { 
       # this is already done in Tpm.pm, why isn't that sufficient?
       $_ =~ s/^\n*// ;

The texmfPath is now a global variable set by config

@@ -306,54 +304,87 @@
     @SourceFiles = grep(!/^$/,@SourceFiles);
 
     for ($debian_package) {
-      my @texmfPath;
+      #my @texmfPath;
       if ( /^texlive/ ) { 
-	@texmfPath = ("texmf","texmf-dist","texmf-doc");
+	#@texmfPath = ("texmf","texmf-dist","texmf-doc");

The comment says all: As the doc files are installed into /u/s/d/pkg/ in
the texlive packages, and the compatibility links are not already
present, we add the doc/texlive-*/ dir to the texmfPath in the calls to
tpm2license so that the files are found. For this to work I have to
remove the forst doc/ part of the doc files.

+	#
+	# DocFiles are installed into /u/s/d/pkg/...
+	# do we have to strip the first doc/ part
+	@DocFiles = map { $_ =~ s,^doc/,, ; $_; } @DocFiles ;
 	foreach (@RunFiles) {CheckFileExistence($_,\@texmfPath)};
 	foreach (@DocFiles) {CheckFileExistence($_,\@texmfPath)};
 	foreach (@SourceFiles) {CheckFileExistence($_,\@texmfPath)};
 	MergeDirectories(\@RunFiles,\@texmfPath);
 	MergeDirectories(\@DocFiles,\@texmfPath) if (@DocFiles);
 	MergeDirectories(\@SourceFiles,\@texmfPath) if (@SourceFiles);

Separation from printing the license line from the printFiles.
And global texmfPath mixed

-	print "\n" . $pkg_header . "\n";
 	print @RunFiles;
 	print @DocFiles;
 	print @SourceFiles;
       };
       if ( /^tetex-base$/ ) {
-	@texmfPath = (".");
+	#@texmfPath = (".");
 	foreach (@RunFiles) {CheckFileExistence($_,\@texmfPath)};
 	foreach (@DocFiles) {CheckFileExistence($_,\@texmfPath)};
 
 	MergeDirectories(\@RunFiles,\@texmfPath);
 	MergeDirectories(\@DocFiles,\@texmfPath) if (@DocFiles);
-	print "\n" . $pkg_header . "\n";
 	print @RunFiles;
 	print @DocFiles;
       };
       if ( /^tetex-src$/ ) {
-	@texmfPath = (".");
+	#@texmfPath = (".");
 	foreach (@SourceFiles) {CheckFileExistence($_,\@texmfPath)};
 	MergeDirectories(\@SourceFiles,\@texmfPath);
 	unless (! @SourceFiles) {
-	  print "\n" . $pkg_header . "\n";
 	  print @SourceFiles;
 	}
       };
     };
   }
 

The implementation of the coverage check. It depends on the fact the
CheckFileExistence below adds each present file to the coveredfiles
array (global).

+  sub CheckCoverage {
+    my @allfilesinpackage;
+    my @notcoveredfiles;
+    foreach my $tmf (@texmfPath) {
+      push @allfilesinpackage, `find $tmf -type f`;
+    }
+    chomp @allfilesinpackage;
+    foreach (@allfilesinpackage) {
+      next if (m/\.tpm$/);
+      if (!(in_list($_,@coveredfiles))) {
+        push @notcoveredfiles, $_;
+      }
+    }
+    print "\n\nCOVERAGE CHECK:";
+    if ($#notcoveredfiles < 0) {
+      print "OK\n";
+    } else {
+      print "NOT COVERED FILES:\n";
+      foreach (@notcoveredfiles) {
+        print $_,"\n";
+      }
+    }
+  }
+

If you know a better way, tell me ;-) memq I would use with lisp!

+  sub in_list {
+    my ($what, @list) = @_;
+    foreach (@list) { 
+      if ($what eq $_) { return 1; }
+    }
+    return 0;
+  }
+

I wanted to be sure that the texmfPath is not shadowed, but keep the
original layout, so renamed the variable funny. We can just remove the
second variable if we don't need it.
(BTW: why not CheckFileExistence($_,@texmfPath) followed by
my ($file,@texmfPath) = @_? Wouldn't it be simpler? Ok but slower
because an array instead of a reference is handled, but how man items we
have?)


   sub CheckFileExistence {
-    my ($file,@texmfPath) = ($_[0],@{$_[1]});
+    my ($file,@fooonotusedtexmfPath) = ($_[0],@{$_[1]});
     my $found = 0;
     foreach my $texmfDir (@texmfPath) {
-      -f $texmfDir . "/" . $file && ($found =1);
+      -f $texmfDir . "/" . $file && ($found =1) && push @coveredfiles , "$texmfDir/$file" ;
     };
     print STDERR "$file: Does not exist!\n" if ! $found;
   }
 
   sub MergeDirectories {
-    my ($filelist,@texmfPath) = ($_[0],@{$_[1]}); # $filelist is actually a pointer
+    my ($filelist,@fooonotusedtexmfPath) = ($_[0],@{$_[1]}); # $filelist is actually a pointer
     # create a list of dirnames, and remove duplicates
     my @dirnames = map {dirname($_) } @{$filelist};
     my %UniqueHash = map { $_ , 1 } @dirnames;

Here is the part about killing if a directory is not found.

@@ -374,7 +405,8 @@
 	  $fullDir =  ( $_ . "/" . $dirname );
 	};
       };
-      $fullDir or die "This should not happen: no directory $dirname, nowhere.";
+      $fullDir or next;
+      #$fullDir or die "This should not happen: no directory $dirname, nowhere.";
       my  @InstalledFiles = `find $fullDir -maxdepth 1 -type f 2>/dev/null | grep -v tetex` 
 	  or die "Calling find for $dirname, expanded to $fullDir, failed.";
       for (@InstalledFiles) {
#!/usr/bin/perl -w
#
# tpm2licenses.pl
# (c) 2005-2006 Norbert Preining
# (c) 2006 Frank Küster
#
# Lists for every filename.tpm the license as specified in the catalogue
#
# usage:
# perl tpm2licenses.pl <options> [tpm file]
# where <options> =
# 	--catalogue
#       --nocheckcatalogue
#       --tpmdir
#       --package
#       --listallfiles
# optional tpm file: check only that one
#

BEGIN {   # get our other local perl modules.
  ($mydir = $0) =~ s,/[^/]*$,,;
  if ($mydir eq $0) { $mydir = `pwd` ; chomp($mydir); }
  if (!($mydir =~ m,/.*,,)) { $mmydir = `pwd`; chomp($mmydir); $mydir = "$mmydir/$mydir" ; }
  unshift (@INC, $mydir);
#  unshift (@INC, "$mydir/..");
}

use strict;
use Data::Dumper;
#use Getopt::Long;
use File::Basename;
use File::Copy;
use File::Path;
use File::Temp qw/ tempfile tempdir /;
use AppConfig;
#use XML::DOM;
use Cwd;
#use FileUtils qw(canon_dir cleandir make_link newpath member
#		 normalize substitute_var_val dirname diff_list remove_list
#		 rec_rmdir sync_dir walk_dir start_redirection stop_redirection);
#use Tpm;


# initialize AppConfig
my $config = AppConfig->new("catalogue=s", "nocheckcatalogue", "tpmdir=s", "package=s", "what=s", "listallfiles", "texmfPath=s");

# parse configurationfile, if present
my @cfgDirs = (".","./debian","..","~");
my $cfgName = ".tpm2license.cfg";

for my $cfgDir (@cfgDirs) {
    if ( -r "$cfgDir/$cfgName" ) {
	print STDERR "Using configuration file $cfgDir/$cfgName\n";
	$config->file("$cfgDir/$cfgName");
      };
  };
# now parse commandline
$config->getopt();

# assign conffile, commandline or default values:
my $Catalogue = $config->catalogue() ? $config->catalogue() : "/src/TeX/texcatalogue/" ;
my $what = $config->what() ? $config->what() : "files";
my $debian_package = $config->package() ? $config->package() : "tetex-base";
my $tpmdir = $config->tpmdir() ? $config->tpmdir() : "./debian/tpm";
my $nocatalogue = $config->nocheckcatalogue() ? $config->nocheckcatalogue() : '';
my $listallfiles = $config->listallfiles() ? 1 : 0;
my $texmfPathString = $config->texmfPath() ? $config->texmfPath() : ".";
my @texmfPath = split ' ', $texmfPathString;

if ($debian_package) {
  die "Unknown Debian package: $debian_package." unless
      ( $debian_package =~ /^tetex-base$/    || 
	$debian_package =~ /^tetex-src$/     ||
	$debian_package =~ /^texlive$/       ||
	$debian_package =~ /^texlive-base$/  ||
	$debian_package =~ /^texlive-extra$/ ||
	$debian_package =~ /^texlive-lang$/  ||
	$debian_package =~ /^texlive-doc$/   ||
	$debian_package =~ /^texlive-bin$/     );
};

# texlive
# my $TpmDirGlob = $Master . "./texmf-dist/tpm/*.tpm";
# teTeX
my $TpmDirGlob = "$tpmdir/*.tpm";

# only needed if we're in the sourcedir, so no need to bother
my $sourceDir;
chomp( $sourceDir = `pwd`); 
$sourceDir .= "/";

# require Strict;
require XML::DOM;
require FileUtils;
import FileUtils qw(canon_dir cleandir make_link newpath member
		    normalize substitute_var_val diff_list remove_list
		    rec_rmdir sync_dir walk_dir start_redirection stop_redirection);
require Tpm;
#
# what the hell, how do I import this array from Tpm.pm ???
#
my %Tpm2Catalogue = (
		     "ctib" => "ctib4tex",
		     "CJK" => "cjk",
		     "bayer" => "universa",
		     "bigfoot" => "suffix",
		     "cb" => "cbgreek",
		     "cd-cover" => "cdcover",
		     "cmex" => "cmextra",
		     "cs" => "csfonts",
		     "cyrplain" => "t2",
		     "devanagr" => "devanagari",
		     "eCards" => "ecards",
		     "ESIEEcv" => "esieecv",
		     "euclide" => "pst-eucl",
		     "GuIT" => "guit",
		     "HA-prosper" => "prosper",
		     "ibycus" => "ibycus4",
		     "ibygrk" => "ibycus4",
		     "IEEEconf" => "ieeeconf",
		     "IEEEtran" => "ieeetran",
		     "iso" => "isostds",
		     "iso10303" => "isostds",
		     "jknapltx" => "jknappen",
		     "kastrup" => "binhex",
		     "le" => "frenchle",
		     "mathtime" => "mathtime-ltx",
		     "omega-devanagari" => "devanagari-omega",
		     "pdftexdef" => "pdftex-def",
		     "procIAGssymp" => "prociagssymp",
		     "resume" => "res",
		     "SIstyle" => "sistyle",
		     "SIunits" => "siunits",
		     "syntax" => "syntax2",
		     "Tabbing" => "tabbing" );

my $parser = new XML::DOM::Parser;
my $startdir=getcwd();
chdir($startdir);
File::Basename::fileparse_set_fstype('unix');

my @TpmList;
my @coveredfiles;

if (@ARGV) {
  # we have a (list of) packages on the command line
  @TpmList = @ARGV;
}
else {
  create_tpmlist();
};

list_licenses();

1;

my $LocalTPM;
my $licline;
my $bn;
my $pkgcat;
my $node;
my $printfiles = '';

sub create_tpmlist {
  foreach (<$TpmDirGlob >) {push(@TpmList,$_)};
};

sub list_licenses {
  foreach $LocalTPM (@TpmList) {
    $printfiles = '';
    $licline = "";
    $bn = &basename($LocalTPM,".tpm");
    next if ($bn =~ m/bin-|collection-/);
    if (defined($Tpm2Catalogue{$bn})) {
      $pkgcat = $Tpm2Catalogue{$bn};
    } else {
      $pkgcat = $bn;
    }
    $licline .= "$bn: ";
    if ($Catalogue =~ m/file:(.*)$/) {
      $licline = `grep ^${bn}: $1`;
      chomp $licline;
      if ($licline eq "") { $licline = "tpm $bn not found in $1, strange"; }
      $printfiles = 1;
    } else {
      my $fletter = substr($pkgcat, 0, 1);
      my $catname = "${Catalogue}/entries/$fletter/${pkgcat}.xml";
      if (! -r $catname) {
        $catname = "$tpmdir/${pkgcat}.xml";
        if (! -r $catname) {
	  $licline .= "not-in-catalogue";
        };
      }
      my $ltype;
      unless ($nocatalogue  || (! -r $catname) || $pkgcat =~ m/^individual.*/) { 
        #don't try to parse the xml file if we don't have a catalogue
        my $cat = $parser->parsefile($catname);
        my ($version, $lversion, $lchecked, $luser, $lfile);
        $node = $cat->getElementsByTagName("version")->item(0);
        if ($node) {
	  $version = $node->getAttribute("number");
        }
        $node = $cat->getElementsByTagName("license")->item(0);
        if ($node) {
	  # ok we have a license entry in there
	  $ltype = $node->getAttribute("type");
	  $lversion = $node->getAttribute("version");
	  $lchecked = $node->getAttribute("checked");
	  $luser = $node->getAttribute("username");
	  $lfile = $node->getAttribute("file");
        }
        if ("$lversion$lchecked$luser" eq "") {
	  if ("$ltype" eq "") {
	    $licline .= "unknown";
	  } else {
	    $licline .= "$ltype (unverified)";
	    # we know the license, it makes sense to output the files
	    $printfiles = '1';
	  }
        } else {
	  $version ||= ''; # make sure we have no uninitialized string values
	  $lversion ||= '';
	  $licline .= "$ltype (verification data:$version:$lversion:$lchecked:$luser:$lfile)";
	  $printfiles = '1';
        }
      }
      if ( $pkgcat =~ m/^individual.*/ ) {
        $ltype = $pkgcat;
        $ltype =~ s/individual_(.*)/$1/;
        $licline = "$pkgcat $ltype (verification data:::::header)";
        $printfiles = '1';
      };
    } # else part of Catalogue = file:...
    $what eq "license" && print "$licline\n";
    # we know the license, it makes sense to output the files
    $what eq "files" && print "\n% $licline\n";
    if ($what eq "files" && ($printfiles || $nocatalogue || $listallfiles)) {
      printFiles($LocalTPM,$licline);
    }
  }
  CheckCoverage();


  sub printFiles {
    my ($LocalTPM,$licline)= @_;
    my $pkg_header = "";
    my $dom_parser = new XML::DOM::Parser;
    my $doc = $dom_parser->parsefile($LocalTPM);
    my %SourceFiles = Tpm::getListField($doc, "SourceFiles");
    my %RunFiles = Tpm::getListField($doc, "RunFiles");
    my %DocFiles = Tpm::getListField($doc, "DocFiles");

    #
    # NORBERT
    # getListField returns a hash, and s{text} SHOULD be an array reference
    # why isn't it like this???
    # If it would be an array reference one could easily check whether
    # sourcefile(text) is empty or not!!!
    # Trick: If it was emtpy there is not size key!
    #
    if (!defined($SourceFiles{"size"})) { 
	$SourceFiles{"text"} = ""; 
    }
    if (!defined($DocFiles{"size"})) { 
	$DocFiles{"text"} = ""; 
    }
    if (!defined($RunFiles{"size"})) { 
	$RunFiles{"text"} = ""; 
    }
     
    foreach ($RunFiles{"text"}, $DocFiles{"text"}, $SourceFiles{"text"}) { 
      # this is already done in Tpm.pm, why isn't that sufficient?
      $_ =~ s/^\n*// ;
      # remove the texmf-dist/ we don't need
      $_ =~ s@texmf-dist/@@g;
      # make sure there's exactly one newline at the end
      chomp;
      $_ =~ s/$/\n/ ;
    };

    # we don't want the tpm file which isn't installed
    $RunFiles{"text"} =~ s/\n.*\.tpm$//m;

    my @SourceFiles = split(/\n/m,$SourceFiles{"text"});
    my @RunFiles = split(/\n/m,$RunFiles{"text"});
    my @DocFiles = split(/\n/m,$DocFiles{"text"});
    foreach (@SourceFiles) { 
      s/^\s//;
      s@^[\s\n]*(.*)[\s\n]*$@$1@so;
      s@\n\s*@\n@gm;
    };
    foreach (@RunFiles) { 
      s/\s//;
      s@^[\s\n]*(.*)[\s\n]*$@$1@so;
      s@\n\s*@\n@gm;
    };
    foreach (@DocFiles) { 
      s/\s//;
      s@^[\s\n]*(.*)[\s\n]*$@$1@so;
      s@\n\s*@\n@gm;
    };
    @DocFiles = grep(!/^$/,@DocFiles);
    @RunFiles = grep(!/^$/,@RunFiles);
    @SourceFiles = grep(!/^$/,@SourceFiles);

    for ($debian_package) {
      #my @texmfPath;
      if ( /^texlive/ ) { 
	#@texmfPath = ("texmf","texmf-dist","texmf-doc");
	#
	# DocFiles are installed into /u/s/d/pkg/...
	# do we have to strip the first doc/ part
	@DocFiles = map { $_ =~ s,^doc/,, ; $_; } @DocFiles ;
	foreach (@RunFiles) {CheckFileExistence($_,\@texmfPath)};
	foreach (@DocFiles) {CheckFileExistence($_,\@texmfPath)};
	foreach (@SourceFiles) {CheckFileExistence($_,\@texmfPath)};
	MergeDirectories(\@RunFiles,\@texmfPath);
	MergeDirectories(\@DocFiles,\@texmfPath) if (@DocFiles);
	MergeDirectories(\@SourceFiles,\@texmfPath) if (@SourceFiles);
	print @RunFiles;
	print @DocFiles;
	print @SourceFiles;
      };
      if ( /^tetex-base$/ ) {
	#@texmfPath = (".");
	foreach (@RunFiles) {CheckFileExistence($_,\@texmfPath)};
	foreach (@DocFiles) {CheckFileExistence($_,\@texmfPath)};

	MergeDirectories(\@RunFiles,\@texmfPath);
	MergeDirectories(\@DocFiles,\@texmfPath) if (@DocFiles);
	print @RunFiles;
	print @DocFiles;
      };
      if ( /^tetex-src$/ ) {
	#@texmfPath = (".");
	foreach (@SourceFiles) {CheckFileExistence($_,\@texmfPath)};
	MergeDirectories(\@SourceFiles,\@texmfPath);
	unless (! @SourceFiles) {
	  print @SourceFiles;
	}
      };
    };
  }

  sub CheckCoverage {
    my @allfilesinpackage;
    my @notcoveredfiles;
    foreach my $tmf (@texmfPath) {
      push @allfilesinpackage, `find $tmf -type f`;
    }
    chomp @allfilesinpackage;
    foreach (@allfilesinpackage) {
      next if (m/\.tpm$/);
      if (!(in_list($_,@coveredfiles))) {
        push @notcoveredfiles, $_;
      }
    }
    print "\n\nCOVERAGE CHECK:";
    if ($#notcoveredfiles < 0) {
      print "OK\n";
    } else {
      print "NOT COVERED FILES:\n";
      foreach (@notcoveredfiles) {
        print $_,"\n";
      }
    }
  }

  sub in_list {
    my ($what, @list) = @_;
    foreach (@list) { 
      if ($what eq $_) { return 1; }
    }
    return 0;
  }

  sub CheckFileExistence {
    my ($file,@fooonotusedtexmfPath) = ($_[0],@{$_[1]});
    my $found = 0;
    foreach my $texmfDir (@texmfPath) {
      -f $texmfDir . "/" . $file && ($found =1) && push @coveredfiles , "$texmfDir/$file" ;
    };
    print STDERR "$file: Does not exist!\n" if ! $found;
  }

  sub MergeDirectories {
    my ($filelist,@fooonotusedtexmfPath) = ($_[0],@{$_[1]}); # $filelist is actually a pointer
    # create a list of dirnames, and remove duplicates
    my @dirnames = map {dirname($_) } @{$filelist};
    my %UniqueHash = map { $_ , 1 } @dirnames;
    @dirnames = keys %UniqueHash;

    # For searching, we create a hash that contains the filenames as keys:
    my %SearchHash;
    %SearchHash = map { $_, 1 } @{$filelist} ;

    my %DirComplete = map { $_, 1 } @dirnames;
    for (@dirnames) {
      my $dirname = $_;
      my $fullDir;
      my $rootDir;
      for (@texmfPath) {
	if ( -d ( $_ . "/" . $dirname )) { 
	  $rootDir = $_;
	  $fullDir =  ( $_ . "/" . $dirname );
	};
      };
      $fullDir or next;
      #$fullDir or die "This should not happen: no directory $dirname, nowhere.";
      my  @InstalledFiles = `find $fullDir -maxdepth 1 -type f 2>/dev/null | grep -v tetex` 
	  or die "Calling find for $dirname, expanded to $fullDir, failed.";
      for (@InstalledFiles) {
	chomp;
	s@^$rootDir/@@;
	$DirComplete{$dirname} = 0 unless $SearchHash{$_};
      };
      if ( $DirComplete{$dirname} ) {
	for (@{$filelist} ) {
	  # replace the file by its directory name
	  s@$dirname/.*@$dirname/*@;
	};
      };
# 	print STDERR "Directory $_ is $DirComplete{$dirname}\n";
    };

    # now the complete directories occur multiple times, remove duplicates again
    %UniqueHash = map { ("$_\n" , 1) } @{$filelist} ;
    @{$filelist} = keys %UniqueHash;
  }

}


Reply to: