Re: We need a global decision about R data in binary format, and stick to it.

Jeremy Stanley writes ("Re: We need a global decision about R data in binary format, and stick to it."):
> No argument on the first, but the second sets a bad precedent if
> interpreted strongly. For example I have a program which relies on a
> fairly large set of correlative data requiring hours of expensive
> computation to generate. In the source package I include the
> original data on which the resulting tables are based and provide a
> means to regenerate it on the fly at package build time, but disable
> it by default so that it doesn't chew up build resources
> unnecessarily.
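The pattern described above can be sketched in a few lines of Python. Everything here is illustrative rather than taken from the package in question: the REGEN_DATA environment variable, the file name, and the stand-in computation are all hypothetical.

```python
import json
import os

def regenerate_tables():
    """Stand-in for the hours-long computation that builds the data."""
    return {"pairs": [[0, 1, 0.93], [0, 2, 0.41]]}

def load_tables(cached_path="tables.json"):
    """Return the tables, rebuilding only when explicitly requested.

    By default the pregenerated copy shipped in the source package is
    used; setting REGEN_DATA=1 forces a from-scratch rebuild.
    """
    if os.environ.get("REGEN_DATA") == "1" or not os.path.exists(cached_path):
        tables = regenerate_tables()
        with open(cached_path, "w") as f:
            json.dump(tables, f)        # refresh the shipped copy
    else:
        with open(cached_path) as f:
            tables = json.load(f)       # fast default path
    return tables
```

The point of the arrangement is that the expensive branch stays available and auditable, but build resources are only spent on it when someone opts in.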

That makes sense, and is IMO a good reason for not doing the complete
from-scratch build each time.

> Since I need to generate the correlation data for other (non-Debian)
> users of the software anyway, I ship the generated files in the
> source package too and just include them in the binary package
> (along with instructions and tooling for the end user to be able to
> build datasets they can use to override the default ones provided).
> While my example is Python rather than R, I expect it's
> representative of situations for many scientific tools. Perhaps some
> guidance on when this tactic is or is not appropriate would be
> beneficial.

There should IMO be a standard way to request a source package to do
from-scratch rebuilds for this kind of thing, for QA purposes.
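One plausible shape for such a hook, sketched here as an assumption rather than an existing convention: DEB_BUILD_OPTIONS is the real space-separated option list that Debian Policy already defines, but the "regenerate" keyword below is hypothetical, i.e. precisely the kind of value a standard would need to pick.

```python
import os

def want_from_scratch_rebuild(env=os.environ):
    """Check DEB_BUILD_OPTIONS for a (hypothetical) regenerate keyword."""
    opts = env.get("DEB_BUILD_OPTIONS", "").split()
    return "regenerate" in opts
```

A package's build machinery could consult this once and route into the expensive data-generation path only when a QA rebuild asks for it.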