
Re: Is tabular data in binary format acceptable for Debian ?

On Thu, 21 Jan 2010, Charles Plessy wrote:
> It is technically true, but I think that we are drifting. To my
> knowledge, there is no such .Rdata file in R packages.

I haven't checked the archive exhaustively, so I don't know. It's
certainly possible to generate, though.

> The current subject of discussion is tables in binary format.

That may be what you're discussing, but I'm talking about why it's
unreasonable to expect the ftpmasters to know what a relatively
specialized package's on-disk data format looks like, and in which
cases it is a non-lossy transformation of the source, and in which
cases it isn't.

What you're discussing is entirely a non-issue, as far as I'm
concerned, because a non-lossy transformation is just that.

> On the other hand, I am sure that in Debian there are files that are
> similar in spirit to your example.

I'm certain as well, but I file bugs when I find them.

> For instance, I have seen PDF documents with PNG plots for which we
> have not the necessary material to regenerate or modify them, for
> instance Excel or OpenOffice spreadsheets, Gnuplot or R code, and
> source data – which can be gigabytes big.

In the vast majority of cases, the source is relatively small. No
matter how large it is, it's always a bug[1] when we're not
distributing it.

That said, there are certainly specific cases where the actual source
code can be prohibitively large for Debian to distribute. I wouldn't
have a problem with not distributing such source, so long as it was
publicly available somewhere, and Debian maintained a copy of it.
[Just because it's a bug doesn't mean we have to (or even can!) fix
it.]

> To come back to the original problem, I will consider the .Rdata
> files in my packages free unless our archive administrators reject
> again a package that contains some, since in the case of tables,
> whatever Upstream uses (or not) to generate them, he is not holding
> up information that would give him an advantage over people willing
> to fork.

In the case of epiR, that's correct. But again, I reiterate that it
has to be clear on a case by case basis. Ftpmaster *should* REJECT
packages when it's not clear to them whether source is being
distributed (or otherwise contact the maintainer to get

> Once again, I would like to remind how disproportionate is the time
> that we have to spend for this kind of issues (.Rdata files, PDF
> files, documenting copyrights of source files we do not use,
> repackaging to remove windows executables, …) in order to get free
> software accepted in our free distribution.

If we want to have a free distribution, we have to take the time to
make sure it's free. When upstream has done due diligence, it's easy.
When upstream hasn't, we have to do it ourselves. If that's not a goal
we share any more, then it's time to revisit the statements in our
foundation documents.

> It kills the fun, sometimes degrades our relations with Upstream,
> and I have not yet seen a user thanking us for doing this.

Consider this email a user thanking everyone who spends time making
sure their packages in main have source available.

Many upstreams care more about making excellent software than they
care about making excellent free software, and that's something
every maintainer of packages in Debian struggles with from time to
time.

Don Armstrong

1: There's some question whether it's required under DFSG §2, so the bug
may not be RC... but it's at least minor severity.

When I was a kid I used to pray every night for a new bicycle. Then I 
realized that the Lord doesn't work that way so I stole one and asked
Him to forgive me.
 -- Emo Philips.

http://www.donarmstrong.com              http://rzlab.ucr.edu
