
Re: Refactoring unpack/list-${type}pkg and Read_pkglists



On 2011-07-13 01:07, Russ Allbery wrote:
> Niels Thykier <niels@thykier.net> writes:
> 
>> [...]
>> Secondly... Is there any use for the "architecture" and the "files"
>> (not to be confused with the "file") field in the source package?
>> Architecture does not appear to be referenced and the "files" field
>> appears to be corrupted in all existing files, so I doubt we use it.
> 
> Yeah, files looks like it's not used.  The goal was originally to
> accumulate all of the files that a *.dsc file refers to, but it looks like
> we're getting the size instead.  My guess is that there's no point to that
> field.
> 
> I don't think either architecture or standards-version are used.  I
> suspect we could drop both of those unless we wanted to generate reports
> on the architecture and standards version used in the archive for some
> reason.  (And we could always add them back in later.)
> 

I have a private branch with Read_pkglists all gone in favour of a
"PackageList" class/module (except for reporting/*[0]).  I have also
started the process of making the Lab maintain its own package lists.
Since I have run into some minor issues, I figured it would be a good
time to stop and get some fresh eyes on the problem and my proposed
solution.

My grand vision is to move the "what changed" logic into PackageList (or
at least closer to it) and let the Lab use these files as a manifest of
what it contains.  My hope is that a "mirror sync" gets the flow of:

 my ($src_list, $pkg_list, $udeb_list) = gen_pkg_list($mirror);
 my $lab = Lab->new($dir);
 my $diff = $lab->generate_pkg_diff($src_list, $pkg_list, $udeb_list);

(Note: $lab->generate_pkg_diff would probably delegate the work to the
PackageList class, but that is a different story.)
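A minimal sketch of what such a diff could compute, assuming each list
is a hashref mapping a package name to its "version" and "timestamp"
fields.  The name diff_pkg_lists and the added/changed/removed layout
are hypothetical, not something that exists in the branch:

```perl
use strict;
use warnings;

# Hypothetical helper: compare the lab's list against the mirror's list.
# Both arguments are assumed to be hashrefs of the form
#   package name => { version => ..., timestamp => ... }
sub diff_pkg_lists {
    my ($old, $new) = @_;
    my (@added, @changed, @removed);

    for my $name (sort keys %$new) {
        if (!exists $old->{$name}) {
            # Only in the new list.
            push @added, $name;
        } elsif ($old->{$name}{version} ne $new->{$name}{version}
              || $old->{$name}{timestamp} != $new->{$name}{timestamp}) {
            # In both lists, but version or timestamp differs.
            push @changed, $name;
        }
    }
    for my $name (sort keys %$old) {
        # Only in the old list.
        push @removed, $name unless exists $new->{$name};
    }
    return { added => \@added, changed => \@changed, removed => \@removed };
}

# Example with made-up entries.
my $lab_list    = { foo => { version => '1.0-1', timestamp => 100 } };
my $mirror_list = { foo => { version => '1.0-2', timestamp => 200 },
                    bar => { version => '0.1-1', timestamp => 150 } };
my $diff = diff_pkg_lists($lab_list, $mirror_list);
print "$_: @{ $diff->{$_} }\n" for qw(added changed removed);
```

The same routine could be run once per list type (source, binary, udeb).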

My first problem is adding/replacing a package in the lab, because the
package lists (in some cases) appear to contain more information than
is available in the package metadata.  This means adding a source
package via the dsc file (e.g. lintian --unpack $dsc) does not produce
the same result as using unpack/list-srcpkg.

The full list of fields we need is in [1], but the problematic one
appears to be "area" for source packages.  For bin files we could
extract the deb control file and (ab)use the $section field, but I
cannot seem to find anything useful for the source package[2].
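For the binary case, a rough sketch of deriving the area from $section.
It assumes the usual convention that contrib and non-free sections carry
an "area/" prefix and everything else is in main; area_from_section is
a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical helper: map a Section value to an archive area.
# Assumption: "contrib/net" => contrib, "non-free/doc" => non-free,
# an unprefixed section such as "utils" => main.
sub area_from_section {
    my ($section) = @_;
    # No section available: leave the area blank.
    return '' unless defined $section && $section ne '';
    return $1 if $section =~ m{^(contrib|non-free)/};
    return 'main';
}

print area_from_section($_), "\n" for ('contrib/net', 'utils');
```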

I am leaning towards just blanking the field in this case and letting
html_reports do a src->bin check for the area.
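Such a src->bin check might look roughly like this.  It is only a
sketch: fill_src_area is a hypothetical name, and the lists are assumed
to be hashrefs keyed by package name using the "source" and "area"
fields from [1]:

```perl
use strict;
use warnings;

# Hypothetical helper: fill in a blank source "area" from its binaries.
#   $src_list: source name => { area => ... }
#   $bin_list: binary name => { source => ..., area => ... }
sub fill_src_area {
    my ($src_list, $bin_list) = @_;
    for my $bin (sort keys %$bin_list) {
        my $entry = $bin_list->{$bin};
        my $name  = $entry->{source};
        next unless defined $name && exists $src_list->{$name};
        my $src = $src_list->{$name};
        # Keep an area that is already set (including one set by an
        # earlier binary in sorted order).
        next if defined $src->{area} && $src->{area} ne '';
        $src->{area} = $entry->{area};
    }
    return $src_list;
}

# Example with made-up entries.
my $srcs = { foo => { area => '' } };
my $bins = { 'foo-bin' => { source => 'foo', area => 'contrib' } };
fill_src_area($srcs, $bins);
print "foo: $srcs->{foo}{area}\n";
```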


There is also a second problem, namely making a package list file for
changes files (if only for symmetry).  I think source, version, file
and timestamp should be enough for that, though.


If this plan works out, harness would be updated to use the lab to create
the mirror diff.  With the diff, it would then hack the log (as it does
now) and then have lintian do its regular run.
  On a related note, I am thinking of making lintian accept a
--from-file (or similar) argument that makes lintian read files (or
package names) from a file (or stdin).  Looking at the code, the
--packages-file option seems like overkill (we discard everything but
the file field).
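A rough sketch of the reading side of such an option, assuming one file
or package name per line.  The helper read_targets is hypothetical, and
skipping blank lines is my assumption rather than settled behaviour:

```perl
use strict;
use warnings;

# Hypothetical helper: read one file or package name per line from an
# already opened handle (a regular file or STDIN).
sub read_targets {
    my ($fh) = @_;
    my @targets;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next if $line eq '';        # assumption: blank lines are ignored
        push @targets, $line;
    }
    return @targets;
}

# Example using an in-memory handle; passing \*STDIN would work the same.
my $input = "foo_1.0-1.dsc\n\nbar_2.0-1_all.deb\n";
open(my $fh, '<', \$input) or die "open: $!";
my @targets = read_targets($fh);
close($fh);
print "$_\n" for @targets;
```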

~Niels

[0] Which, I just realized, is horribly broken without said module... oh
well.

[1]
  package (bin)
  source (bin,src)
  source-version (bin)
  version (bin,src)
  file (bin,src) => $file
  timestamp (bin,src) => stat $file
  area (bin,src) $section in bin, but not src?
  files (src) - broken anyway
  standards-version (src)
  architecture (src)
  maintainer (src)
  uploaders (src)

[2] That is, short of unpacking the debian part and pulling $section
from the first binary in d/control.

