
Re: the quality of Debian's diff.gz



On Sunday 01 June 2008, Manoj Srivastava wrote:
> On Sun, 1 Jun 2008 13:37:43 +0300, George Danchev <danchev@spnet.net> said:
--cut--
>         Peer reviewers can either look at the SCM archive (since that is
>  how the package is developed), or the divergence bugs, when they get
>  filed. I am not sure the diff.gz is the best way for collaborative
>  development.

Of course diff.gz is not the best way to do collaborative development, but it
is a pretty decent and cheap way to review how the upstream code has been
patched. Think of it in terms of security and safety.
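
As a rough illustration of such a cheap review, here is a minimal Python
sketch that lists which files a unified diff touches and separates direct
changes to upstream code from packaging files and from patches kept under
debian/patches/. The function names, the sample diff and the "hello" package
are all made up for the example; header detection is a simple heuristic, not
a full diff parser.

```python
import gzip


def touched_files(diff_text):
    """Return the paths named on '+++ ' header lines of a unified diff.

    Heuristic: a '+++ ' line counts as a header only when the previous
    line starts with '--- '. Unusual hunk content could still fool it.
    """
    paths = []
    previous = ""
    for line in diff_text.splitlines():
        if line.startswith("+++ ") and previous.startswith("--- "):
            # Header looks like '+++ pkg-1.0/src/main.c<TAB>timestamp'.
            target = line[4:].split("\t")[0].strip()
            # Drop the leading directory, as 'patch -p1' would.
            parts = target.split("/", 1)
            paths.append(parts[1] if len(parts) == 2 else target)
        previous = line
    return paths


def classify(paths):
    """Group paths into quilt-style patches, packaging files, direct changes."""
    result = {"patches": [], "packaging": [], "direct": []}
    for path in paths:
        if path.startswith("debian/patches/"):
            result["patches"].append(path)
        elif path.startswith("debian/"):
            result["packaging"].append(path)
        else:
            result["direct"].append(path)
    return result


def read_diff_gz(filename):
    """Read a gzip-compressed diff as text (filename supplied by the caller)."""
    with gzip.open(filename, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read()


# Made-up sample standing in for the contents of a real diff.gz.
SAMPLE = """\
--- hello-1.0.orig/src/main.c
+++ hello-1.0/src/main.c
@@ -1,3 +1,3 @@
 int main(void) {
-    return 1;
+    return 0;
 }
--- hello-1.0.orig/debian/changelog
+++ hello-1.0/debian/changelog
@@ -0,0 +1 @@
+hello (1.0-1) unstable; urgency=low
--- hello-1.0.orig/debian/patches/01_fix_exit.diff
+++ hello-1.0/debian/patches/01_fix_exit.diff
@@ -0,0 +1 @@
+(patch body elided)
"""

if __name__ == "__main__":
    for kind, names in classify(touched_files(SAMPLE)).items():
        print(kind, names)
```

On a real package one would feed it the output of read_diff_gz() instead of
SAMPLE. Anything that lands in the "direct" group is upstream code changed
in place, which is exactly the part a reviewer cannot easily attribute to a
single documented patch.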

> >> Anything less is not good practice; we should be sending our
> >> divergence upstream.
> >
> > Very good, but please make these easily visible/readable to the rest
> > via diff.gz
>
>         Why via the diff.gz?  Why is the divergence bug mechanism not
>  being considered?

Because people would hardly trust the BTS as a record of the source changes
actually applied to a particular source tree.

> >> > devscripts package to use debcheckout, they might be pure $UNIX
> >> > users relying on patch, diff and a simple text editor.  So, will
> >> > you generate at some point a series of logically separated quilt
> >> > patches and store them in debian/patches/ in the final diff.gz
> >> > which is the canonical way of Debian to distribute changes?
> >>
> >> This can only work if the topic branches are non-overlapping; and
> >> that might cover a lot of cases, but not all.
> >>
> >> I am not so sure it is indeed a canonical way of doing things in
> >> Debian.  If it is, which canon do you follow?
> >
> > The above sentence of mine says: "diff.gz which is the canonical way
> > of Debian to distribute changes.". A normative document stipulates:
> > "debianisation diff is a unified context diff (diff -u) giving the
> > changes which are required to turn the original source into the Debian
> > source.". The mere request is to make your debianisation diff a good
> > citizen, from which the logically separated changes to the upstream
> > code can be recovered as clearly identified and documented diff
> > files. Then, the rest of the world can just get that diff.gz from
> > a myriad of media *at their will* and try it against whatever
> > orig.tar.gz or local SCM working copy they have.
>
>         I think that the diff.gz might not be the best way of
>  disseminating the changes to the rest of the community. Distributed
>  SCM's are far more powerful than stacked patches, and I want to bring
>  my downstream the benefits of a distributed version control system, and
>  bring them into the development fold, so to say.

That would require unifying on a single SCM for packaging. I'm not really
hopeful that is possible within Debian, are you?

> >> > There should be ways to use both, since you depreciate your diff.gz
> >> > and it turns out to be a useless scratch of bits. Then, again, why
> >> > have diff.gz at all when it is not credible enough?
> >>
> >> Credible \Cred"i*ble\ (kr[e^]d"[i^]*b'l), a. [L. credibilis, fr.
> >> credere. See {Creed}.]  Capable of being credited or believed; worthy
> >> of belief; entitled to confidence; trustworthy.
> >>
> >> I find the diff.gz as credible as anything else. Why would it be less
> >> trustworthy?
> >
> > It is less trustworthy since it applies multiple changes to the
> > upstream tree in a combined fashion, which leaves the bad impression
> > that whoever created it didn't have a clear idea of what he was doing.
>
>         I think that such a conclusion merely shows the naivete of the
>  person reaching that conclusion. The diff.gz represents the integration
>  branch, is meant to deliver the sources the package was built with. It
>  is not a means for collaborative development, really.

I already said that it is not meant for collaborative development, but for
checking whether a certain package could cause you trouble because it was
badly patched by certain contributors, while the same package runs happily as
supplied by other distributors.

> >> In any case, as development proceeds, tools change. patch and diff
> >> were great, once upon a time, just like programming in raw Hex was
> >> still in vogue when I started programming.
> >>
> >> >> I do have a emacs package you can look at for details, if you
> >> >> wish:
> >> >> http://git.debian.org/git/users/srivasta/debian/vm.git
> >> >
> >> > Using a modern SCM is wonderful, but please, get back to the
> >> > ground,
> >>
> >> I don't think that using modern SCM's is crazy (if not being on the
> >> ground has the connotations I think it has).
> >
> > Oh, that is the wrong road, believe me. I don't claim that using SCM
> > is crazy, almost everyone uses some, me included. I claim however that
> > *ab*using a modern SCM might degrade the quality of the end product
> > you finally distribute officially, i.e. the diff.gz in that case. The
> > difference is slight, but crucial. A notable example of a well done end
> > product is the one coming from pkg-glibc; you would love to read
> > their clearly identified changes as supplied by their diff.gz.
>
>         You are saying that topic branches are an abuse of a distributed
>  SCM? 

I'm saying that even with a powerful SCM you are currently unable to create a
simple, readable diff.gz, and instead propose BTS assistance to achieve a
partial success. This is a shame; it seems like adding a turbocharger and
losing power in the end... but this might improve in the future.

>  If so, I find it hard to take anything else you have to say on the 
>  subject seriously.

Excuse moi, but I also question some of your ideas.

> >> > and think of the possible use cases with what Debian has officially
> >> > released, and whether that warrants a certain level of
> >> > unification. There are users (let's say within restricted areas)
> >> > who can't access random DD repos at will, but rely solely on
> >> > diff.gz supplied by released source CD/DVD media. Please note that
> >> > development history of changes is not of any help here, but what
> >> > exactly has been applied (as logically separated changes) to a
> >> > particular upstream version being released.
> >>
> >> People who want to actually follow development would need some net
> >> access -- or, with the 3.0 (git) format, have their topic branches on
> >> their DVD.  Realistically speaking, do you have any numbers on such
> >> users who must rely on DVD's and can't get someone to burn a DVD for
> >> them?
> >
> > Do you suggest cloning every pkg repo out there and burning that to
> > DVD, then rushing over to the company's restricted area, passing
> > various identification checks? It doesn't scale well in my
> > opinion. Realistically speaking there are certain entities (consider
> > some government agencies, but also some companies) who might happen to
> > use Debian in areas where public networks are not allowed, because
> > they are supposed to obey certain policies. And I doubt there is an
> > easy way to gain any hard numbers about such users, since most
> > probably they will pretend they do not exist ;-)
>
>         I am suggesting that I'll be using the 3.0 (git) format as soon
>  as possible, and then it will already be on the DVD.

Sure, that sounds pretty good to me, but it would probably take decades to
convert all Debian source packages to the 3.0 (git) format, since 3.0 (quilt)
is the one currently under discussion.

> >> There is a tradeoff. There is correctness of the package , which is
> >> eased by using topic branches (at least, for me), which trumps the
> >> use cases you are talking about. Now, in cases where the branches do
> >> not overlap, there can be a simple conversion to a stacked patch
> >> format; and I'll have no objection to using a tool that can do the
> >> conversion (at the expense of source package size bloat).
> >
> > It sounds pretty fair to me to trade some size for more readability.
>
>         Good. Then you do not object to a .git directory in the sources.

I do not object to that.

-- 
pub 4096R/0E4BD0AB 2003-03-18 <people.fccf.net/danchev/key pgp.mit.edu>
fingerprint 1AE7 7C66 0A26 5BFF DF22 5D55 1C57 0C89 0E4B D0AB 

