
Bug#865769: Second data package including some machine-readable data



Hi!

On Sat, 2017-06-24 at 09:57:33 -0700, Russ Allbery wrote:
> Package: debian-policy
> Version: 4.0.0.2
> Severity: wishlist

> A discussion in #865720 got me thinking that there is some data maintained
> in Policy that would be useful to have in a machine-readable format.  The
> things that have occurred to me so far are:
> 
> - The list of registered virtual packages

This one definitely makes sense, because policy is the canonical place
where this is defined.

> - The list of archive sections and their descriptions

I think this belongs in each archive that provides those sections,
alongside the other archive metadata. I'd rather see the involved
parties define an appropriate file for the archive to publish, so that
any downloader, which has to fetch the metadata anyway, would use that
instead of hardcoding the list.

Using a file from policy does not seem useful to me, because it would
mean software would need to depend on such a policy-provided package,
and if you are going to mix and match repos, you really need the
metadata from the archive you are pulling from.

In addition the text in policy states that the canonical list is
maintained by the archive anyway. :)

> - The list of valid Debian control field names (by type of control file)

I'm uncertain about this one, but I tend to think it is partly in a
similar situation to the previous one.

For example, dpkg already contains such a list (probably more
exhaustive) in Dpkg::Control::Fields, and I don't see dpkg depending
on an external list, because dpkg is used beyond Debian.

The "list" in dpkg currently has some problems though:

 - it lives in a Perl module, so it is not easily accessible to other
   languages.
 - it tracks which control files each field is available in, but cannot
   currently distinguish the differing semantics (field separator) of
   fields with the same name, for example Files in .changes and .dsc.
 - it lacks information on whether a field is folded, simple,
   multiline, etc.

My plan is to remedy at least the last two points with a new Perl
module hierarchy. I'm not sure if the first is worth "fixing" in dpkg
though?
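
To make the last two points more concrete, here is a rough sketch of
what a language-neutral field table could record. This is not how dpkg
does it today; the names FieldKind, FieldSpec, FIELDS and lookup, the
"dsc"/"changes"/"control" keys and the column labels are all made up
for illustration. The point is only that keying on (control file, field
name) lets Files carry a different line layout in .dsc and .changes.

```python
from dataclasses import dataclass
from enum import Enum


class FieldKind(Enum):
    # The three field syntaxes Policy distinguishes.
    SIMPLE = "simple"
    FOLDED = "folded"
    MULTILINE = "multiline"


@dataclass(frozen=True)
class FieldSpec:
    # Hypothetical record; "columns" describes the whitespace-separated
    # items expected on each continuation line, if any.
    name: str
    kind: FieldKind
    columns: tuple = ()


# Keyed by (control file type, lowercased field name), so the same
# field name can have different semantics per file type.
FIELDS = {
    ("dsc", "files"): FieldSpec(
        "Files", FieldKind.MULTILINE, ("md5sum", "size", "filename")),
    ("changes", "files"): FieldSpec(
        "Files", FieldKind.MULTILINE,
        ("md5sum", "size", "section", "priority", "filename")),
    ("control", "package"): FieldSpec("Package", FieldKind.SIMPLE),
    ("control", "build-depends"): FieldSpec("Build-Depends", FieldKind.FOLDED),
    ("control", "description"): FieldSpec("Description", FieldKind.MULTILINE),
}


def lookup(file_type, field_name):
    """Return the FieldSpec for a field in a given file type, or None."""
    return FIELDS.get((file_type, field_name.lower()))
```

The same table could just as well be serialized to JSON or YAML, which
is what would make it usable outside of Perl (or Python).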

For the equivalent in policy, I think I see where you are coming from,
and I think it would be nice to have most of policy in a declarative
format that could be used by linters or some parsers. But if that
makes those somewhat Debian-specific, it might not take off. I guess
that to avoid this, the paths and names used to get to that information
would need to be somewhat neutral and allow for other derivatives with
their own policies. :)

> These are things that either we already maintain or that have no other
> obvious place to live.  This data could then be consumed by packages like
> lintian (although that's a bit tricky for lintian.debian.org),
> libconfig-model-dpkg-perl, etc.

Perhaps the list of common licenses. Another thing that comes to mind
is a file with common regexes to validate things that policy
specifies, say package names, version strings, etc., precisely
because those can and do diverge from what dpkg accepts, for example.
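
As a minimal sketch of what one such shared regex might look like,
here is the policy rule for package names (lowercase letters, digits,
plus, minus and period; at least two characters; starting with an
alphanumeric) written as a pattern. The names PACKAGE_NAME and
is_policy_package_name are hypothetical, and version strings are left
out because their rules are more involved.

```python
import re

# Policy package name rule: first character alphanumeric (lowercase),
# then one or more of a-z, 0-9, '+', '.', '-'.
PACKAGE_NAME = re.compile(r"[a-z0-9][a-z0-9+.-]+")


def is_policy_package_name(name):
    """Return True if name matches the policy package name syntax."""
    return PACKAGE_NAME.fullmatch(name) is not None
```

A data file would ship only the pattern string; each consumer would
compile it with its own regex engine.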

Valid pathnames too. As I mentioned above, ideally all of policy
would be available in a declarative format, but that'd be a pretty
huge undertaking. So it might make sense to do a quick poll first and
ask whether people would use any of this, because otherwise it seems
a bit like a waste.

> The idea would be to provide these in some machine-readable form (probably
> JSON unless someone has objections) in files under /usr/share/debian-policy
> or some similar path (so that software can consume them) in a separate
> binary package built from the debian-policy package (debian-policy-data,
> perhaps) so that other packages can depend on that package without pulling
> in the larger human-focused Policy documentation.

I don't think I have a direct use for any of the above anyway, but I
think I'd prefer YAML, because it is more human-readable. That's not
a strong objection in any case.

Thanks,
Guillem

