On 10/16/2013 01:37 AM, Andreas Tille wrote:
Hi Martin,
On Sat, Oct 12, 2013 at 09:59:35AM -0700, Martin Morgan wrote:
On 10/12/2013 02:59 AM, Maintainer wrote:
Hi, the Debian Med team tries to package several parts of BioConductor. When we tried to upload GenomicRanges, our ftpmaster criticised that the source contains some precomputed results inside the documentation, which conflicts with our policy requiring the source for all binary data. There could be different solutions for this:
1. If you consider the files GenomicRanges/inst/doc/precomputed_results/*.rda not very important for the user documentation, it might be sufficient to download the files from somewhere else.
2. Provide a recipe to reproduce the precomputed results that we could use in the package building process to recreate the data.
Maybe there are other solutions, but these come to my mind for the moment. Any hint what we should do?
Andreas -- you've brought this topic up before; you've provided guidance at https://wiki.debian.org/GNU_R
Sure, I know, and I really hoped that this would be convincing enough for our ftpmasters - but unfortunately it was not (see the link to the ftpmaster decision on this page). We kept on discussing the issue with ftpmaster and they just came up with their stronger-than-hoped requirement.
Basically, these are serialized R objects, so their content is transparent to users in the same way that a binary image is visible (and useful) to a user.
I'll try uploading with the other explanation given by Hervé Pagès and hope this will pass. Sorry for bothering you about this and thanks for your patience.
For precomputed_results above, it looks like these could be generated by a script, but the specific results depend on a web service query and the web service changes from time to time. So the script will become out-of-date, creating data that are no longer consistent with the illustrative purposes of the vignette. Also, the time cost of generating the data is not consistent with our (nightly) build process; we will not generate this data on the fly, and it would be a mistake for your release process to generate data different from the data used in our release. These (expense of computation, consistency of external data sources) are typical reasons for shipping precomputed results.
When the 'affy' maintainer receives one of these emails, and the email mentions three data sets, and the three data sets are documented in the man page as data sets from an experiment (e.g., ?SpikeIn), what is one supposed to do? Or rather, why is he being contacted in the first place?
From a non-technical perspective: (1) It's presumptuous to suggest that the data files are not important for user documentation; if they were not important, why would the author have gone to the trouble to include them in the first place? (2) If you are going to contact our maintainers, then please let me know about the extent of the contact and the intention; I would rather have a discussion on our developer mailing list than have each maintainer wondering how to react.
Martin
Andreas.
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793