Re: Sourceless but useless: how about ignoring some irrelevant files instead of repackaging?
Dear all,
here is a tentative summary of the thread, followed by my personal
contribution.
The original question is whether we should allow package maintainers to
ignore some non-DFSG-free (but legally redistributable) files in our source
packages if we can build the binary packages without them, instead of
systematically repackaging.
Examples of such files are:
 - sourceless PDFs,
 - binaries for other operating systems,
 - some IETF documents,
 - GFDL-licensed documentation with cover texts. 
With helper tools such as CDBS and git-buildpackage in particular, the time
spent on repackaging can be minimised. Nevertheless, it is still debated
whether repackaging should be encouraged or not.
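To illustrate what repacking amounts to, here is a minimal Python sketch. It
is not what CDBS or git-buildpackage actually do; the function name `repack`,
the file names and the exclusion patterns are all hypothetical, chosen only
for the example.

```python
import fnmatch
import tarfile


def repack(src_path, dst_path, exclude_patterns):
    """Copy the tarball at src_path to dst_path (gzip-compressed), skipping
    every member whose path matches one of the shell-style patterns.

    Returns the list of member names that were left out, so that they can
    be documented (for instance in debian/copyright).
    """
    removed = []
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            # fnmatch's "*" also matches "/", so "*.pdf" catches PDFs
            # at any depth of the source tree.
            if any(fnmatch.fnmatch(member.name, pat)
                   for pat in exclude_patterns):
                removed.append(member.name)
                continue
            # Only regular files carry data; directories, symlinks and
            # the like are copied from their header alone.
            data = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, data)
    return removed


# Hypothetical usage:
# repack("foo_1.0.orig.tar.gz", "foo_1.0~dfsg.orig.tar.gz",
#        ["*.pdf", "*/win32/*"])
```

The returned list is the part that matters for review: it records exactly
what differs between the upstream tarball and the repacked one.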
Using exactly the sources distributed upstream is a clear advantage when
reviewing a package to see which modifications are specific to Debian.
However, it was also argued that from the point of view of freedom-checking,
it is easier to manage repacked “original” tarballs than to allow non-free
files and have to check whether the ignored files can really be ignored.
One more benefit of removing files is that it makes the package lighter.
Here are three unsorted points from the thread, with my personal comment in
parentheses for the first one:
 - When removing binary packages, it is not necessary to rename the original
   tarball “~dfsg” (could somebody confirm/deny?).
 - Policy does not require documenting the repackaging in README.source;
   debian/copyright is the right place for this.
 - If we could ignore some files from the original tarball, could we also
   ignore them in debian/copyright, which is already tedious enough to write?
Part of the thread was more focused on sourceless PDFs, for which it was argued
that they are not forbidden by the DFSG. It was proposed to write a GR to make
this clear if necessary. This goes beyond what I originally proposed (ignore
the PDFs and not repack the original tarball containing them), but my opinion
is also that they should not be treated as non-free material, as it is always
possible to recover the text. I would therefore be most happy if our archive
administrators changed their mind on this subject, and I would support a GR
if necessary.
I am sure that many developers can provide examples of rejection of sourceless
PDFs. In this thread, ttf-gfs-{didot,baskerville,olga,porson} were given as
examples. I can also add dialign-t to the list.
http://lists.debian.org/msgid-search/871wamzhy1.fsf@vorlon.ganneff.de
Importantly, whatever the outcome of this discussion, repacking will still be
necessary. For instance, there are PDFs that are definitely
non-redistributable. In the packages of the science section they are not
uncommon: these are the scientific articles written by the upstream authors,
who had to cede their copyright to a publishing company. I therefore would
like to give a big thank you to the CDBS and git-buildpackage developers.
I am actually very tempted to migrate to git, and if you do not mind me opening
a sub-discussion, I would like to hear some comments about the following. Since
git repositories contain all the history, if a non-free file slips in and
is discovered after many commits have been made, it will be a pain to remove it.
For the moment, non-free software is allowed on Alioth, so if the file is
legally redistributable, it is possible to ignore it. But how will the project
manage that kind of situation once the format ‘3.0 (git)’ is accepted in
our archive? We have thousands of packages and there is no doubt it will happen
some day.
Have a nice day,
-- 
Charles
(please CC me, I am not subscribed)