invalid and duplicate architectures
Hi,
this all started when I learned that an architecture wildcard like "any-armhf"
is in fact invalid and, contrary to my earlier belief, does not match the
Debian architecture "armhf":
$ dpkg-architecture -iany-armhf -aarmhf && echo yes || echo no
no
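
The same check can be done from Perl. Below is a minimal sketch using
debarch_is from Dpkg::Arch (the function the attached script relies on). The
list of wildcards is just an illustrative pick of mine. As far as I understand
the matching rules, "any-arm" and "linux-any" should match armhf because its
CPU is "arm" and its OS is "linux", while "any-armhf" should not because
"armhf" is not a CPU name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Dpkg::Arch qw(debarch_is);

# debarch_is($arch, $wildcard) is true if $arch is matched by $wildcard.
# The wildcards below are illustrative picks, not taken from any package.
my @results;
foreach my $wildcard (qw(any-armhf any-arm linux-any armhf)) {
    my $verdict = debarch_is('armhf', $wildcard) ? 'yes' : 'no';
    push @results, "$wildcard: $verdict";
}
print "$_\n" foreach @results;
```
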
Since I thought it would be nice if lintian could warn about the use of such
invalid architecture wildcards in package metadata, I started writing a script
to gauge the magnitude of the problem.
In case this is useful for anybody, attached please find a script which, given
a Sources file, prints:
- invalid architecture wildcards in build-depends and conflicts (prefix ID),
  the Architecture field (prefix IA) and generated binary packages as listed
  in the Package-List field (prefix IB)
- superfluous wildcards (lists of wildcards in which the same architecture is
  matched more than once) in build-depends and conflicts (prefix DD), the
  Architecture field (prefix DA) and generated binary packages as listed in
  the Package-List field (prefix DB)
Some (maybe) interesting results:
Most common invalid architectures:
$ ./findarchwildcardproblems.pl Sources | egrep '(ID|IA|IB)' | cut -d ' ' -f 3 | sort | uniq -c | sort -n
1 avr
1 darwin-any
1 disabled
1 freebsd-any
1 knetbsd-any
1 kopensolaris-amd64
1 kopensolaris-any
1 linux-ia64
1 linux-ppc64el
1 netbsd-any
1 openbsd-any
1 sh3eb
1 sh4eb
1 solaris-amd64
1 solaris-any
1 solaris-i386
2 any-armeb
2 any-armel
2 any-armhf
2 any-avr32
2 any-m32r
2 any-s390
2 any-sh3
2 any-sh3eb
2 any-sh4eb
2 m32r
2 netbsd-alpha
2 sh3
6 netbsd-i386
9 kfreebsd-alpha
9 knetbsd-alpha
10 knetbsd-i386
10 kopensolaris-i386
10 or1k
15 any-ia64
17 any-ppc64el
17 hurd-amd64
23 avr32
31 armeb
32 hurd-alpha
37 lpia
74 ppc64el
123 arm
434 mips64
434 mips64el
441 mipsn32
441 mipsn32el
634 ia64
676 s390
Packages with superfluous wildcards:
$ ./findarchwildcardproblems.pl Sources | egrep '(DD|DA|DB)'
DD: ettercap arm64 libluajit-5.1-dev
DD: gcc-3.3 hurd-i386 locales
DB: gcc-defaults s390x
DD: gcc-snapshot sh4 gnat-4.9
DA: hyperestraier amd64
DA: hyperestraier i386
DA: kfreebsd-10 kfreebsd-amd64
DA: kfreebsd-10 kfreebsd-i386
DA: kfreebsd-10 kfreebsd-amd64
DA: kfreebsd-10 kfreebsd-i386
DA: kfreebsd-9 kfreebsd-amd64
DA: kfreebsd-9 kfreebsd-i386
So in summary, invalid wildcards mostly stem from old architectures that are
no longer in the archive, from new architectures that are not in the archive
yet, or from a few more creative cases like "any-armhf" or "disabled".
Superfluous wildcards seem to be a much smaller problem and concern only seven
source packages.
In my code I counted as "valid" all Debian architectures that are listed on
packages.debian.net. Is there a better way to retrieve "valid" architectures in
this context?
Does it make sense to file bug reports for some of these problems, or do you
think that would be a wasted effort?
cheers, josch
#!/usr/bin/perl
use strict;
use warnings;

# print invalid architecture wildcards (wildcards that do not match any
# existing architecture) and duplicate wildcards (an architecture is matched
# by more than one wildcard) in build dependencies, conflicts, the
# Architecture field and in binary packages listed in the Package-List field

use Dpkg::Control;
use Dpkg::Compression::FileHandle;
use Dpkg::Deps;
use List::MoreUtils qw{any};
use List::Util qw{first};
use Dpkg::Arch qw(debarch_is);

my $desc = $ARGV[0]; # /home/josch/gsoc2012/bootstrap/tests/sid-sources-20140101T000000Z
if (not defined($desc)) {
    die "need filename";
}
my $fh = Dpkg::Compression::FileHandle->new(filename => $desc);

# the Debian architectures that are counted as "valid"
my @debarches = ("amd64", "armel", "armhf", "hurd-i386", "i386", "kfreebsd-amd64", "kfreebsd-i386", "mips", "mipsel", "powerpc", "s390x", "sparc",
    "alpha", "arm64", "hppa", "m68k", "powerpcspe", "ppc64", "sh4", "sparc64", "x32");

while (1) {
    my $cdata = Dpkg::Control->new(type => CTRL_INDEX_SRC);
    last if not $cdata->parse($fh, $desc);
    my $pkgname = $cdata->{"Package"};
    next if not defined($pkgname);
    my @depfields = ('Build-Depends', 'Build-Depends-Indep', 'Build-Depends-Arch',
        'Build-Conflicts', 'Build-Conflicts-Indep', 'Build-Conflicts-Arch');
    # search for invalid arches in the dependency and conflict fields
    foreach my $depfield (@depfields) {
        my $dep_line = $cdata->{$depfield};
        next if not defined($dep_line);
        foreach my $dep_and (split(/\s*,\s*/m, $dep_line)) {
            foreach my $dep_or (split(/\s*\|\s*/m, $dep_and)) {
                my $dep_simple = Dpkg::Deps::Simple->new($dep_or);
                my $depname = $dep_simple->{package};
                next if not defined($depname);
                my $arches = $dep_simple->{arches};
                next if not defined($arches);
                # find wildcards that do not match any existing architecture
                foreach my $arch (@{$arches}) {
                    # strip the negation marker; only the name matters here
                    $arch =~ s/^!//;
                    next if (any {debarch_is($_,$arch)} @debarches);
                    print "ID: $pkgname $arch $depname\n";
                }
                # search for duplicate arches in arch restrictions
                # set match frequency to zero for all arches
                my %matchfreq = ();
                foreach my $arch (@debarches) {
                    $matchfreq{$arch} = 0;
                }
                # find duplicates
                foreach my $arch (@{$arches}) {
                    $arch =~ s/^!//;
                    foreach my $debarch (@debarches) {
                        if (debarch_is($debarch, $arch)) {
                            $matchfreq{$debarch} += 1;
                        }
                    }
                }
                # print duplicate matches
                foreach my $arch (@debarches) {
                    if ($matchfreq{$arch} > 1) {
                        print "DD: $pkgname $arch $depname\n";
                    }
                }
            }
        }
    }
    # search for invalid arches in Architecture field
    my $architecture = $cdata->{"Architecture"};
    if (defined($architecture)) {
        # find wildcards that do not match any existing architecture
        foreach my $arch (split(/\s+/m, $architecture)) {
            next if ($arch eq "all");
            next if (any {debarch_is($_,$arch)} @debarches);
            print "IA: $pkgname $arch\n";
        }
        # search for duplicate arches in Architecture field
        # set match frequency to zero for all arches
        my %matchfreq = ();
        foreach my $arch (@debarches) {
            $matchfreq{$arch} = 0;
        }
        # find duplicates
        foreach my $arch (split(/\s+/m, $architecture)) {
            next if ($arch eq "all");
            foreach my $debarch (@debarches) {
                if (debarch_is($debarch, $arch)) {
                    $matchfreq{$debarch} += 1;
                }
            }
        }
        # print duplicate matches
        foreach my $arch (@debarches) {
            if ($matchfreq{$arch} > 1) {
                print "DA: $pkgname $arch\n";
            }
        }
    }
    # gather the architectures of the generated binary packages
    my $packagelist = $cdata->{"Package-List"};
    if (defined($packagelist)) {
        foreach my $line (split(/\n/m, $packagelist)) {
            my $architecture = first { /^arch=/ } split(/\s+/m, $line);
            next if (not defined($architecture));
            $architecture =~ s/^arch=//;
            # find wildcards that do not match any existing architecture
            foreach my $arch (split(/,/m, $architecture)) {
                next if ($arch eq "all");
                next if (any {debarch_is($_,$arch)} @debarches);
                print "IB: $pkgname $arch\n";
            }
            # search for duplicate arches in this Package-List entry
            # set match frequency to zero for all arches
            my %matchfreq = ();
            foreach my $arch (@debarches) {
                $matchfreq{$arch} = 0;
            }
            # find duplicates
            foreach my $arch (split(/,/m, $architecture)) {
                next if ($arch eq "all");
                foreach my $debarch (@debarches) {
                    if (debarch_is($debarch, $arch)) {
                        $matchfreq{$debarch} += 1;
                    }
                }
            }
            # print duplicate matches
            foreach my $arch (@debarches) {
                if ($matchfreq{$arch} > 1) {
                    print "DB: $pkgname $arch\n";
                }
            }
        }
    }
}
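
The script repeats the same two steps (find wildcards that match nothing, count
how often each architecture is matched) for each of the three fields. These
could be folded into one helper. The following is only a sketch: the name
check_wildcards and its interface are made up for illustration, and the matcher
is passed in as a code reference so that the example runs without the Dpkg
modules. In the script itself the matcher would be \&Dpkg::Arch::debarch_is,
and the caller would still strip a leading "!" and skip "all" beforehand.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): classify a list of
# architecture wildcards against a list of known architectures.
# $is_match is a code ref called as $is_match->($debarch, $wildcard).
# Returns two array refs: wildcards that match no architecture, and
# architectures that are matched by more than one wildcard.
sub check_wildcards {
    my ($wildcards, $debarches, $is_match) = @_;
    my @invalid;
    my %matchfreq = map { $_ => 0 } @{$debarches};
    foreach my $wildcard (@{$wildcards}) {
        my @hits = grep { $is_match->($_, $wildcard) } @{$debarches};
        push @invalid, $wildcard if not @hits;
        $matchfreq{$_} += 1 foreach @hits;
    }
    my @duplicate = grep { $matchfreq{$_} > 1 } @{$debarches};
    return (\@invalid, \@duplicate);
}

# Toy matcher for demonstration only: "any" matches everything, anything
# else must be an exact name. It does NOT implement dpkg's wildcard rules.
my $toy_match = sub {
    my ($debarch, $wildcard) = @_;
    return $wildcard eq 'any' || $wildcard eq $debarch;
};

my ($invalid, $duplicate) = check_wildcards(
    ['any', 'amd64', 'bogus'], ['amd64', 'i386'], $toy_match);
print "invalid: @{$invalid}\n";
print "duplicate: @{$duplicate}\n";
```

With the toy matcher, "bogus" is reported as invalid because it matches
neither architecture, and amd64 is reported as a duplicate because both "any"
and "amd64" match it.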