[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: RFC: DEP-3: Patch Tagging Guidelines



Hello everybody,

I'll try to do some new proposals based on your feedback. But first let
me address the topic of the usefulness of the proposal. While there are
currently no tools making use of this format, I can imagine many
interesting usage for this information. It starts with the simple stats
(how many debian specific patch do we use?) and goes on to providing
a nice web interface where people can browse all patches:
- check all non-forwarded patches and help forwarding them
- let upstream developers browse all patches which are not backports
- let other distributions check all patches which are not debian-specific
In any case, it's a required step IMO if we want to increase the visibility
of our patches and ensure they are better reviewed.

As you have noted, the format is quite simple and you don't have to use
everything it can offer at the start if you feel it would take too much
time. Only Description and Origin are required. The requirement is not
enforced but it's the minimum needed to make it DEP3-compliant. :)

While a formal process is not needed to start using something like this in
my own packages (and I already did), I believe it's important to have this
discussion so that we have something broadly accepted to build upon.
Otherwise I keep reinventing field names and it's more difficult to
integrate it in the Debian ecosystem (lintian, dpkg-dev, etc.).

I hope we can all agree that it can be useful, that we can make it
lightweight enough so that it's not a big pain for maintainers and as
such that's it's a step in the right direction.

Now let's go back to the content of the proposal.

On Mon, 15 Jun 2009, Sune Vuorela wrote:
> Wouldn't a better first goal be to have just a freeform text field ?
> With the current amount of comments in the patches of random packages I
> touch, just a oneliner would be a significatn improvement.
> 
> AS long as not a major part of debian people comment their patches, the
> format really doesn't matter.

I agree that one sentence is better than nothing, however I don't see why
this would forbid other maintainers to use something more elaborated.
Furthermore the format allows for simple things like:

Description: Change the pid file path
Origin: Debian

So hopefully the format will not make maintainers run away because it's
too complex.

On Mon, 15 Jun 2009, Mark Brown wrote:
> On Mon, Jun 15, 2009 at 06:12:49PM +0200, Raphael Hertzog wrote:
> >   * `Signed-off-by` (optional)
> > 
> >     This field can be used to document the fact that the patch has been
> >     reviewed by one or more persons. It should list their names and
> >     emails in the standard format (similar to the example given for
> >     the `Origin` field), separated by commas if needed.
> 
> For the avoidance of confusion I would suggest that this be changed to
> Reviewed-by - the normal Linux/git Signed-off-by has a specific meaning
> that needn't include actually doing a code review.

I started first with "Reviewed-by" and then thought that it was stupid to
not reuse the name that is already vastly used for a similar purpose. What
do other people think? I'm fine with both names.

On Tue, 16 Jun 2009, Charles Plessy wrote:
> The dh_make template for debian/copyright induces many developers to put their
> packaging work under the GPL, and I have already seen packages whose license is
> otherwise BSD-ish with such patches. If the maintainer suddenly goes MIA and
> the patch is non-trivial, then in theory if we want to respect what is written,
> we are stuck with a GPL'ed patch. Therefore, we have an optional License field
> to make things crystal clear if necessary.

I have no opposition against an optional License field. Can you try to word a
description for it?

On the other side, I'm also not convinced it's really useful... if a patch
author wants some specific licence different from upstream's license, he
should make that explicit in the patch itself when he adds his own
copyright notice.

> for your effort to unifiy the format. Personally, I do not mind changing our
> local format for the DEP3 format as long as we have one release cycle to do it.
> Some of our packages have a very slow turnover.

There's no forced switch planned... it has not technical impact on the
distribution, so I don't mind if not all packages are converted, after
it's up to you to see if new lintian warnings annoy you enough or not to
live with it. :)

----

Now I'll switch to the discussion about the Origin/Status/Patch fields.
It seems that this set of fields is not as optimized as it could be.

On Mon, 15 Jun 2009, Josselin Mouette wrote:
> Le lundi 15 juin 2009 à 18:12 +0200, Raphael Hertzog a écrit :
> >   * `Patch` (optional)
> 
> Maintaining this information up-to-date can be troublesome.
> 
> >   * `Status` (optional)
> 
> Same here. At the moment a package is uploaded, it can be
> must-be-forwarded, then later it becomes forwarded and later on it can
> change again. Which means a lot of additional commits, and that the
> version in sid can be easily outdated.

Josselin noted that updating the Patch/Status fields can be easily missed
when the patch has been forwarded upstream. And in general, any
meta-information can become outdated. It would be good to have a way
to detect this in some way. I propose to add a new optional field called
"Last-Update" with a date value (YYYY-MM-DD) that maintainers concerned
about this can use.

But this is just a helper. Something even better would be to not have
to update existing fields but just add new fields as new steps are
done. By combining several suggestions (quoted below), we can
have some interesting proposal.

On Tue, 16 Jun 2009, Felipe Sateler wrote:
> >   * Has the patch been forwarded upstream?
> I think answering a simple question (and one of the most likely to be asked 
> about a patch) should be done by a simple rule.

On Tue, 16 Jun 2009, sean finney wrote:
> >   * Has the patch been forwarded upstream?
> why not have a field for that instead?  for example:
> 
>  * `Forwarded` (optional)
> 
>    Has the bug been forwarded to upstream?  Any value other than "No" or "no"
>    will be taken as a postive value.  If the value is a URL, then it is
>    inferred that it points to an online discussion related to forwarding
>    the patch to upstream.  The value is also implicitly considered
>    positive if there are upstream bug references elsewhere in the headers.

On Wed, 17 Jun 2009, Ben Finney wrote:
>   * `Origin` (required)
> 
>     This field should document the origin of the patch. It must be of
>     the following forms::
> 
>         keyword
>         keyword ":" description
> 
>     The description is free-form text giving more information about the
>     origin of the patch. The keyword is one of the following:
> 
>         * `upstream`, for a patch cherry-picked from the upstream VCS
>           and applied directly.
> 
>         * `backport`, for an upstream patch modified to apply to the
>           current version of the source.
> 
>         * `other`, for an origin not covered by any of the above
>           keywords.
> 
> Perhaps that could be made more concise, but I hope the intent is clear
> this way.

Merging all those ideas, I suggest we drop Status/Commit/Patch and use the
following format:

Origin: <upstream|backport|vendor|other> : <url-or-commit>
Forwarded: <yes|no|not-needed|<url-or-text>>

In "Origin" we put the link where we grabbed the patch when we added it to
our package, and in "Forwarded" we put the link to the patch in the upstream
bugtracker/mailing list/etc once it has been forwarded. If the "Forwarded"
field is missing, its value is assumed to be "no" unless we have a "Bug"
field, in which case it is assumed to be yes. If the patch is
vendor-specific, it should have "Forwarded: not-needed" and Description
should explain why it's vendor specific.

We also add a supplementary "Author" field, in case we want to record the
information of who wrote/maintains the patch.


Here's a first patch for all this, comments/reviews welcome:
--- a/web/deps/dep3.mdwn
+++ b/web/deps/dep3.mdwn
@@ -62,77 +62,74 @@ of any other distribution that tracks the same problem/patch.
 
     This obligatory field contains at least a short description on the
     first line. Supplementary lines can be used to provide a longer
-    explanation of the patch.
+    explanation of the patch and its history.
+
+    This field should explain why the patch is vendor-specicific (ex:
+    branding patch) when that is the case. If the patch has been submitted
+    upstream but has been rejected, the description should also document
+    why it's kept and what were the reasons for the reject.
 
   * `Origin` (required)
 
-    This field should document the origin of the patch. It can have the
-    following standard values: "upstream" (in the case of a patch cherry-picked
-    from the upstream VCS), "backport" (in the case of an upstream patch
-    that had to be modified to apply on the current version). Any other
-    value is supposed to be free-form text describing the origin of the
-    patch. It should typically be the name and email of the patch author
-    (ex: "`John Bear <foo@bar.net>`") or the project/company that she worked
-    for when she authored the patch.
+    This field should document the origin of the patch. It starts with a
+    single keyword that can have the following standard values: "upstream"
+    (in the case of a patch cherry-picked from the upstream VCS),
+    "backport" (in the case of an upstream patch that had to be modified
+    to apply on the current version), "vendor" for a patch created
+    by Debian or another distribution vendor, or "other" for all other
+    kind of patches. It can be optionally followed by a colon with
+    leading/trailing spaces before/after it. In that case, all the content
+    after the colon contains supplementary information defining in more
+    details the origin of the patch. In most cases, it should be a simple
+    URL. For patches backported/taken from upstream, it should point into
+    the upstream VCS web interface when possible, otherwise it can simply
+    list the relevant commit identifier (it should be prefixed with
+    "commit:" in that case). For other cases, one should simply indicate
+    the URL where the patch got grabbed (mailing list archives,
+    distribution bugtrackers, etc.) when possible.
 
   * `Bug-<Vendor>` or `Bug` (optional)
 
-    It contains one or more URLs (space separated) pointing to the related bugs
-    (possibly fixed by the patch). The `Bug` field is reserved
-    for the bug URL(s) in the upstream bug tracker.
+    It contains one URL pointing to the related bug (possibly fixed by the
+    patch). The `Bug` field is reserved for the bug URL in the upstream
+    bug tracker. Those fields can be used multiple times if several
+    bugs are concerned.
 
-  * `Patch` (optional)
+  * `Forwarded` (optional)
 
-    URL pointing to the patch. It can be in a VCS web interface,
-    a BTS attachment, etc. If the patch is available in the upstream VCS
-    or BTS, those URLs take precedence.
+    Any value other than "no" or "not-needed" means that the patch
+    has been forwarded upstream. Ideally the value is an URL proving
+    that it has been forwarded and where one can find more information
+    about its inclusion status.
+ 
+    If the field is missing, its implicit value is "yes" if the "Bug"
+    field is present, otherwise it's "no". The field is really required
+    only if the patch is vendor specific, in that case its value should
+    be "not-needed" to indicate that the patch must not be forwarded
+    upstream (whereas "no" simply means that it has not yet been done).
 
-  * `Commit` (optional)
+  * `Author` (optional)
 
-    One or more commit identifiers. This should only be used when the
-    `Patch` field can't be used because the upstream project has no VCS web
-    interface or similar restrictions.
-
-  * `Status` (optional)
-
-    This field is a textual explanation of its status concerning its
-    inclusion in the upstream project. The first line should consist of a
-    single keyword among "&lt;vendor&gt;-specific" (the patch must not be
-    forwarded as it is specific to a vendor, ex: branding patches), 
-    "must-be-forwarded" (nobody has taken the time to forward the patch
-    yet), "forwarded" (the patch has been forwarded, the Bug or Patch
-    fields should contain the corresponding URLs) or "rejected" (the patch
-    has been submitted but it has been rejected upstream). Supplementary
-    lines can be used to explain in more details the status of the patch.
-    It should be used for example to explain why the patch has been
-    rejected, or why this change is only meaningful for the vendor.
+    This field can be used to record the name and email of the patch author
+    (ex: "`John Bear <foo@bar.net>`"). Its usage is recommended when the
+    patch author did not add copyright notices for his work in the patch
+    itself. It's also a good idea to add this contact information when
+    the patch needs to be maintained over time because it has very little
+    chance of being integrated upstream.
 
   * `Signed-off-by` (optional)
 
     This field can be used to document the fact that the patch has been
-    reviewed by one or more persons. It should list their names and
-    emails in the standard format (similar to the example given for
-    the `Origin` field), separated by commas if needed.
+    reviewed by someone. It should list her name and email in the standard
+    format (similar to the example given for the `Author` field). This
+    field can be used mutiple times if several persons reviewed the
+    patch.
 
+  * `Last-Update` (optional)
 
-Interpretation
---------------
-
-In the analysis of the meta-information, a certain number of related
-facts can be deduced from the absence, presence, or combinations of fields
-and their values.
-
-  * Has the patch been forwarded upstream?
-
-    If there is a `Status` field, its value answers the question.
-    If there's an `Origin` field and it contains "upstream" or "backport",
-    the patch comes from upstream and it doesn't need to be forwarded.
-    The same is true if there's a `Commit` field.
-    In other cases, if there is a `Bug` field, then the patch has most
-    likely been referenced in the bug and upstream should know about it.
-    Any package maintainer should ensure that the existence of the patch
-    has been documented in the upstream bug log when he adds the
-    patches' meta-information.
+    This field can be used to record the date when the meta-information
+    have been last updated. It should use the ISO date format
+    `YYYY-MM-DD`.


And to finish, some small answers to various points raised: 

On Tue, 16 Jun 2009, sean finney wrote:
> On Mon, Jun 15, 2009 at 06:12:49PM +0200, Raphael Hertzog wrote:
> >   * `Description` (required)
> > 
> >     This obligatory field contains at least a short description on the
> >     first line. Supplementary lines can be used to provide a longer
> >     explanation of the patch.
> 
> given that the patches which currently have useful information do so
> via freeform comments, i think that it would be very useful to explicitly
> include this in the format.  i.e. anything that doesn't conform to
> the standard header/value format could be taken as the description.

Does it need to be in the format? I see that rather as a supplementary
feature of any patch parser that tries to make use of this format. But
I can add such a recommendation in the "Structure" section. Would you like
to try to draft one?

> >   * `Bug-<Vendor>` or `Bug` (optional)
> > 
> >     It contains one or more URLs (space separated) pointing to the related bugs
> >     (possibly fixed by the patch). The `Bug` field is reserved
> >     for the bug URL(s) in the upstream bug tracker.
> 
> if Bug can refer to more than one bug than perhaps "Bugs" would be better, or
> alternatively you could allow the field to have duplicate instances.

I went for the duplicate instance in the patch above.

> Also, from reading this i'm assuming that debian bugs would be identified
> by "Bug(s)-Debian"?  that seems a bit unwieldly, esp given that there will
> likely be more references to debian bugs than upstream/cross-stream bugs.
> Maybe we should also add a special shorthand for "Closes: #nnn" or similar?
> 
> My personal preference is that Debian gets "Bug" and there's a seperate
> "Bug(s)-Upstream" field, but maybe there are also arguments to the
> contrary?

One of the goals is also to make it easier to share patches among
distribution vendors. So I don't really like to make the format too much
Debian-centric.

> > What about using Debian: (like Ubuntu's Patch Tagging Guidelines) to
> > indicate which Debian bug is fixed by this patch?
> 
> Debian: could be considered a shorthand alias for Bug-Debian maybe?  I
> guess that could also address the above issue that i mentioned.

See my answer to Lucas.

>  * `Debian-Specific` (optional)
> 
>    Is this patch a debian-specific patch that is not intended to be
>    shared with upstream?  For example, changes to specify Debian-specific
>    application paths, configuration file locations, etc.

Already dealt with Forwarded + Description in the new proposal.

> jftr i have no problem implementing whatever comes from this into the
> patch tracking system, which should hopefully make the efforts more
> visible (and maybe even more useful). 

Thanks !


Thank you to whoever read until the end, it was a bit long. But it's a
difficult task to reconcile all remarks in something that makes sense
and that really improves the proposal. I think we did a step in the right
direction.

Cheers,
-- 
Raphaël Hertzog

Contribuez à Debian et gagnez un cahier de l'admin Debian Lenny :
http://www.ouaza.com/wp/2009/03/02/contribuer-a-debian-gagner-un-livre/


Reply to: