
Bug#198569: [ITP]: r-noncran-design -- Regression modeling strategies



On Tue, Jun 24, 2003 at 09:52:12AM -0500, Douglas Bates wrote:
> Dirk Eddelbuettel <edd@debian.org> writes:
> 
> > On Tue, Jun 24, 2003 at 02:31:53PM +0200, Andreas Rottmann wrote:
> > > Dirk Eddelbuettel <edd@debian.org> writes:
[... cut to preserve some space as we're getting more generic here ... ]
> > > I think that 'design' is, also as a source package name, way too
> > > generic. You can't in any way infer what this source package is
> > > about... The same applies (but not as much) to hmisc, IMHO. Why not
> > > name the source packages the same as the binary packages?
> > 
> > a) Transparency, so 'name it the same as upstream'. CRAN packages have their
> > own little conventions and infrastructure. IMHO we gain little by adding
> > another layer of complexity.
> > 
> > b) Precedent. We already have 7 or 8 R add-on packages. Several of these do
> > the same thing. In fact, mine do -- whereas Chris Lawrence's don't. Doug Bates

[ That needed a correction, sorry. See Chris' post in the BTS if you're
  curious. ]

> > plans to release some too. Some uniformity would be good.
> > 
> > Comments, please?
> 
> I think the policy of maintaining the upstream source package name for
> the Debian package is going to cause more and more problems of this
> type.  One of the Omegahat packages for R is called XML.  I'm sure
> there will be a flood of Debian bug reports if anyone tries to
> upload a Debian source package called XML that contains the code for
> an R package.
> 
> The problem with the source package names stems from the fact that
> both the Debian packaging system and the R package-building mechanism
> require a particular directory to have a particular name, and those
> names conflict.  Whenever Dirk and I discuss this with either the
> Debian folks or the R folks we are presented with the same "simple"
> solution - have the other group change their packaging system.  I
> don't think it would be easy to do this in either case but I do think
> that the best long-term solution is a change in the R package building
> mechanism, as I describe below.
> 
> Just to make it clear what happens:
> 
> When building a Debian package of release 1.1 of a system called
> "foo" the Debian package system expects the directory structure
>  .
>   /foo-1.1            # original sources
>           /debian     # Debian-specific files such as rules, control, etc.

I think you are confounding the issue here. There are three layers:
- directory name
- source package name
- binary package name

They can all be different. The source package name comes from
debian/changelog and needs to match the source tarball, the binary package
name comes from debian/control and is *not* required to be aligned with the
directory or source name. The directory name comes from itself.
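
To make the three layers concrete, here is a rough Python sketch that reads the
source name from the first line of a changelog and the binary name(s) from a
control file. The file contents, the version string and the helper names
source_name() and binary_names() are made up for illustration; they are not
part of any Debian tool.

```python
import re

# Hypothetical file contents, for illustration only.
CHANGELOG = """design (1.3.1-1) unstable; urgency=low

  * Initial release.

 -- Maintainer Name <maintainer@example.org>  Tue, 24 Jun 2003 10:00:00 -0500
"""

CONTROL = """Source: design
Section: math
Priority: optional

Package: r-noncran-design
Architecture: any
Description: Regression modeling strategies
"""


def source_name(changelog_text):
    """Return the source package name from the first changelog line."""
    first_line = changelog_text.splitlines()[0]
    match = re.match(r"^(\S+) \(([^)]+)\)", first_line)
    if match is None:
        raise ValueError("unrecognised changelog header: %r" % first_line)
    return match.group(1)


def binary_names(control_text):
    """Return every binary package name listed in a control file."""
    return re.findall(r"^Package:\s*(\S+)", control_text, flags=re.MULTILINE)


print(source_name(CHANGELOG))
print(binary_names(CONTROL))
```

Neither function looks at the name of the directory the files live in, which
is the point: that name is a third, independent thing.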

Generally speaking, and as is the case with many multi-binary packages, I
can build a package 'foo' based on a source 'bar_*orig.tar.gz' in a
directory 'zilch'.  Unless, of course, it is an R package. Why?  Well, for R
packages, I run "R CMD INSTALL -c -l $(TMPLOCATION) ." so the directory name
matters. This imposes a restriction, and as you suggest below, it would be
nice to see this changed in R CMD.

Other than that, I see two (mostly) non-overlapping problems here. One has
to do with how we name .deb packages and their components, and another about
how R builds packages.

Please correct me if I state anything wrong or ambiguously.


> The file ./foo_1.1.orig.tar.gz must contain the original sources as
> downloaded from the repository.  One is allowed to change the name of

Right. To retain identical md5sums, if possible. This allows for renaming,
lower-casing, ... 

> the top-level directory but that is the only change allowed (I think).

Yes, _after_ you untar and without re-tarring, so that the upstream md5sum is
unchanged. Doing this generates a warning during the package build. And
because you can rename, directory foo-1.1/ may become foo/.

> Any other changes in files for building a Debian package are
> incorporated into the .diff.gz file for the Debian package.
> 
> This means that we cannot download a tar file like Design-1.3-1.tar.gz
> from an R archive, stick it in a directory called
> ./r-noncran-design-1.3/ then invoke the R package building mechanism on
> Design-1.3-1.tar.gz.  The downloaded sources must expand to the
> directory in which the Debian build process is run.

Not sure I follow. I did the following:

-- download Design-1.3-1.tar.gz and rename it to design_1.3.1.orig.tar.gz [
lowercase, upstream 1.3-1 collapsed to 1.3.1, '_' as name and version sep.
char, orig.tar.gz as suffix ]
-- untar it; it unpacks into Design/ by default. No change made here!!
-- add four small files in Design/debian
-- build package

and it can be loaded as library(Design) as every R user would expect.
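
The renaming in the first step can be written down as a small Python sketch.
The function name debian_orig_name() is hypothetical, and the rule it encodes
is only the one used above (lowercase the name, turn '-' in the version into
'.', join with '_', append .orig.tar.gz); it assumes the upstream package name
itself contains no '-'.

```python
import re


def debian_orig_name(upstream_tarball):
    """Map an upstream tarball name to a Debian-style .orig.tar.gz name.

    Assumes the input looks like '<Name>-<version>.tar.gz' and that the
    name part contains no '-' (true for 'Design-1.3-1.tar.gz').
    """
    match = re.match(r"^([A-Za-z0-9.]+)-(.+)\.tar\.gz$", upstream_tarball)
    if match is None:
        raise ValueError("unexpected tarball name: %r" % upstream_tarball)
    name, version = match.groups()
    name = name.lower()                  # Design -> design
    version = version.replace("-", ".")  # 1.3-1  -> 1.3.1
    return "%s_%s.orig.tar.gz" % (name, version)


print(debian_orig_name("Design-1.3-1.tar.gz"))
```

Only the tarball's file name changes; its contents, and hence the Design/
directory it unpacks into, stay as upstream shipped them.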

> So according to the Debian conventions the top-level directory for the
> r-noncran-design package should be named something like
>  r-noncran-design-1.3

I am still not sure why.

> The build process for the Debian package is run in the top-level
> package directory which should be the expanded tar.gz file.  In the R
> convention the name of the directory formed by expanding the tar.gz
> file is the name of the package, without any version information.  If
> we call the directory r-noncran-design-1.3 then the R package gets
> named r-noncran-design-1.3 when, according to the R conventions, it
> should be called 'design'.  Users will have software that contains
> calls like
> 
>  require("design")
> 
> not
> 
>  require("r-noncran-design-1.3")
> 
> At present we have two alternatives:
> 
> 1) Name the Debian source package according to the R package name, as
> Dirk suggests.  I don't think this is a viable long-term strategy.
> The names, like "XML", are too vague.

I agree that 'xml' is bad. But let's not throw the baby out with the
bathwater. I'd rather rename a _few_ clear clashes than to point-blank require
all packages to be renamed. Let's call XML omegahat-xml. Maybe call design
harrell-design.  But why would we need to rename tseries to cran-tseries?
I think that would go overboard.

And note that 1) has no implications for directory names.

> 2) Name the expanded directory according to the Debian package name,
> build the R package under the wrong name, then rename a bunch of files
> in ./debian/tmp/usr/lib/R to the correct name of the R package before
> building the Debian package.  This works - sort of.  The preformatted
> help pages get messed up by this process.
> 
> My suggested way out of this is to expand the R package installation
> mechanism to allow the package name to be other than the name of the
> directory containing the package sources.  It could be overridden
> within the DESCRIPTION file or on the command line for the R CMD
> INSTALL call.
> 
> Kurt and Fritz: Is it clear why I want to override the name of the R
> package during the build process?  I have looked at the shell script 
> for R CMD INSTALL and it seems that the code for package bundles
> already allows R_PACKAGE_NAME to be different from R_PACKAGE_DIR in
> the do_install_source function.  Would it be feasible to allow the
> package name for a simple package (i.e. not a bundle) to be different
> from the directory name?  Would it be reasonably easy?


Dirk

-- 
Don't drink and derive. Alcohol and analysis don't mix.


