Bug#1020241: debian-policy: copyright-format: Formatting improvements/changes



On Sun, 2022-09-18 at 18:01:38 -0700, Russ Allbery wrote:
> Guillem Jover <guillem@debian.org> writes:
> 
> > Oh! I've completely missed this all this time, I think because that
> > feels very weird given that it has no synopsis and the text is added
> > already on the first line on the spec. :/
> 
> Other formatted fields with the same semantics are Source, Disclaimer, and
> Comment.  I don't think there are any fields in debian control files with
> those semantics (Description is the only formatted field and it has a
> synopsis), but there are several of them in copyright files.
> 
> Source is another ongoing minor problem, since it's *usually* a URL but is
> not required to be one, and sometimes a textual description of the source
> is needed.  Here too, a structured format would have been nicer, so that
> you could have something like:
> 
> source:
>   urls:
>     - https://example.com/foo
>     - https://example.org/foo
>   comment: >-
>     The foo-rewrite script was originally posted to comp.unix.sources in
>     1992 but otherwise has no source other than the Debian package.

I think Disclaimer and Comment are less problematic because they
tend to contain descriptive prose. Source is indeed weird, as it seems
to want two different semantics depending on the content. Considering
the current deb822 format, using two different field names might have
been better, as you alluded to with your YAML example: say Source-Urls
(a line-based list) and Source-Comment (formatted text). Such a split
still seems feasible and backward compatible right now, though.
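
To make the split concrete, here is a minimal Python sketch of how a
tool might sort the lines of an existing Source value into the two
hypothetical fields. Source-Urls and Source-Comment do not exist in the
current spec, split_source is a made-up helper, and the "looks like a
URL" test is only an assumption for illustration:

```python
URL_PREFIXES = ("http://", "https://", "ftp://")


def split_source(value):
    """Sort the lines of a Source value into URLs and free-form comment text.

    The field names in the result are hypothetical; they are not part of
    the current copyright-format specification.
    """
    urls = []
    comment = []
    for line in value.splitlines():
        text = line.strip()
        if not text:
            continue
        # Assumption: a line is a URL if it starts with a known scheme
        # and contains no spaces. Everything else is treated as prose.
        if text.startswith(URL_PREFIXES) and " " not in text:
            urls.append(text)
        else:
            comment.append(text)
    return {"Source-Urls": urls, "Source-Comment": " ".join(comment)}
```

A parser that only knows the current Source field would be unaffected,
since the new names would simply be additional fields.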

But see below.

> > Right, the problem I see is that applying this formatting to a field
> > that has no special treatment for the first line just after the field
> > name seems very unintuitive, because the first line does not contain an
> > additional prefixing space, or if it does no one is adding it!
> 
> > It feels very weird to me that all these would be equivalent:
> 
> >   Copyright: Something long that might trigger some wrapping behavior
> >     Other thing very long that might not be clear behaves as the above
> >     More
> 
> > and
> 
> >   Copyright:  Something long that might trigger some wrapping behavior
> >     Other thing very long that might not be clear behaves as the above
> >     More
> 
> > and
> 
> >   Copyright:
> >     Something long that might trigger some wrapping behavior
> >     Other thing very long that might not be clear behaves as the above
> >     More
> 
> I think my brain just assumes that all whitespace after the colon of a
> field name and before the first non-whitespace character is ignored, so
> doesn't have a problem with that, but I can see why it would be confusing.

Just to clarify and make sure we are on the same page (if we are,
sorry for stating the obvious!): what I find confusing is that the field
implies different line-wrapping semantics depending on leading
spaces. Because there's no synopsis, the first line is supposed to
act just like the rest, but if spaces are ignored, then how do you
select either of the line-wrapping behaviors for the first line? Also,
adding such spaces after the colon looks like a typographic error
to me somehow.

So I think what seems most confusing to me is that for formatted-text
fields with no synopsis, the first line is being used at all, because
that messes with the intuition on how the Description field operates.
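
As a rough illustration of the problem, here is a small Python sketch.
It assumes the formatted-text rule that a continuation line indented by
one space is word-wrapped and one indented by two or more spaces is
verbatim, and it assumes (as you describe) that whitespace after the
colon is ignored. classify_lines is a hypothetical helper, not an
existing parser:

```python
def classify_lines(raw_field):
    """Label each line of a formatted-text field as "wrapped" or "verbatim"."""
    _name, _, value = raw_field.partition(":")
    first, *rest = value.split("\n")
    result = []
    if first.strip():
        # Assumption: whitespace after the colon is ignored, so the first
        # line carries no indentation and can only be treated as wrapped.
        result.append(("wrapped", first.strip()))
    for line in rest:
        indent = len(line) - len(line.lstrip(" "))
        kind = "verbatim" if indent >= 2 else "wrapped"
        result.append((kind, line.strip()))
    return result
```

Under these assumptions, "Copyright: Foo" and "Copyright:    Foo" give
the same result, so there is no way to ask for verbatim treatment of
the first line short of leaving it empty.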

> > Otherwise, if the current semantics are retained, at least for me, the
> > first line behavior really needs to be clarified.
> 
> Yes, we should distinguish between formatted text with synopsis and
> formatted text without synopsis more clearly.

Yes.

> Or, you know, just propose
> a new YAML format which would make it trivial to clean up all of these
> problems *and* would provide first-class editor support and easy parsing
> in every major programming language.  :)  But that's WAY bigger than this
> bug.

Ahem, yeah. :)

> > If we end up switching the field semantics, that seems it might cause
> > unnecessary modification churn, given that I (not sure whether other
> > people have done this before than me as well) at least have "instigated"
> > unintentionally this type of change in several places (packages I
> > maintain, golang/prometheus team), including tooling (AFAIR dh-make and
> > dh-make-golang), and other people might have also picked this up too. :/
> 
> I think making the field a line-based list is the obviously correct thing
> to do.  It's just not backward-compatible, so we will have to face the
> question of how we handle a version bump in the copyright file (and of
> course figure out if we're going to deal with all of the other requests
> that would require a version bump).

I was thinking that an easy way out might be to say that if
the field has an empty first line (nothing after the colon), then
it's line-based, otherwise it's considered formatted text. That makes
things more complex (perhaps "only" for a transition period), but might
be considered backward compatible?
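
In code, the heuristic would be something like the following Python
sketch. field_kind is a hypothetical helper name, and the input is
assumed to be the raw field value, that is, everything after the colon
including the remainder of the first line:

```python
def field_kind(raw_value):
    """Guess the field semantics from the first line of its raw value.

    Returns "line-based" when nothing but whitespace follows the colon
    on the first line, and "formatted-text" otherwise.
    """
    first_line = raw_value.split("\n", 1)[0]
    if first_line.strip() == "":
        return "line-based"
    return "formatted-text"
```

Existing files that put text on the first line would keep their current
meaning, which is what would make this arguably compatible.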

> And I have packages where individual copyright lines are longer than 80
> columns, so we either have to require unwrapped lines greater than 80
> columns (which I'd rather not do), or we have to define line wrapping
> semantics for line-based lists, which adds yet more irritating ugliness to
> the deb822 format.  Probably just "if the line is indented by more than
> one space, it's a continuation for the previous line" I guess.

Yeah, that also crossed my mind (for example for long copyright year
lists, or long institution names). For output fields, long lines seem
OKish, but for human editable things long lines would certainly be
annoying. But as you say wrapping for line-based lists has annoying
semantics too. I have no good ideas for this right now though. :/
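
For reference, the continuation rule you describe could look like this
in Python. It is only a sketch of "indented by more than one space
means continuation", with a hypothetical helper name, and not something
any current parser implements as far as I know:

```python
def parse_line_based_list(value):
    """Parse a line-based list where deeper-indented lines continue the previous item."""
    items = []
    for line in value.split("\n"):
        if not line.strip():
            # Skip the empty first line (and any blank lines).
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent > 1 and items:
            # More than one leading space: join onto the previous item.
            items[-1] += " " + line.strip()
        else:
            items.append(line.strip())
    return items
```

This keeps long entries editable, at the cost of making indentation
significant in yet another way.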

Thanks,
Guillem

