
Bug#198569: [ITP]: r-noncran-design -- Regression modeling strategies



Dirk Eddelbuettel <edd@debian.org> writes:

> On Tue, Jun 24, 2003 at 02:31:53PM +0200, Andreas Rottmann wrote:
> > Dirk Eddelbuettel <edd@debian.org> writes:
> > 
> > > Package: wnpp
> > > Severity: wishlist
> > >
> > > * Package name    : r-noncran-design
> > >   Version         : 1.1.6
> > >   Upstream Author : Frank Harrell <fharrell@virginia.edu>
> > > * URL             : http://hesweb1.med.virginia.edu/biostat/rms
> > > * License         : GPL
> > >   Description     : Regression modeling strategies
> > >
> > > Design is one of two packages by Frank Harrell and requires the other, Hmisc.
> > > Design provides the code supporting Harrell's 2002 book on 'Regression
> > > Modeling Strategies'.  I intend to stick with the convention of calling the
> > > (Debian) source package the same as the (source) R package -- design -- but
> > > then normalizing on r-noncran-design as done by prior packages maintained by
> > > Chris Lawrence and myself.
> > >
> > I think that 'design' is, also as a source package name, way too
> > generic. You can't in any way infer what this source package is
> > about... The same applies (but not as much) to hmisc, IMHO. Why not
> > name the source packages the same as the binary packages?
> 
> a) Transparency, so 'name it the same as upstream'. CRAN packages have their
> own little conventions and infrastructure. IMHO we gain little by adding
> another layer of complexity.
> 
> b) Precedent. We already have 7 or 8 R add-on packages. Several of these do
> the same thing. In fact, mine do -- whereas Chris Lawrence's don't. Doug Bates
> plans to release some too. Some uniformity would be good.
> 
> Comments, please?

I think the policy of maintaining the upstream source package name for
the Debian package is going to cause more and more problems of this
type.  One of the Omegahat packages for R is called XML.  I'm sure
there will be a flood of Debian bug reports if anyone tries to
upload a Debian source package called XML that contains the code for
an R package.

The problem with the source package names stems from the fact that
both the Debian packaging system and the R package-building mechanism
require a particular directory to have a particular name, and those
names conflict.  Whenever Dirk and I discuss this with either the
Debian folks or the R folks we are presented with the same "simple"
solution - have the other group change their packaging system.  I
don't think it would be easy to do this in either case but I do think
that the best long-term solution is a change in the R package building
mechanism, as I describe below.

Just to make it clear what happens:

When building a Debian package of release 1.1 of a system called
"foo", the Debian package system expects the directory structure:
 .
  /foo-1.1            # original sources
          /debian     # Debian-specific files such as rules, control, etc.


The file ./foo_1.1.orig.tar.gz must contain the original sources as
downloaded from the repository.  One is allowed to change the name of
the top-level directory but that is the only change allowed (I think).
Any other changes in files for building a Debian package are
incorporated into the .diff.gz file for the Debian package.
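
To make the naming rules above concrete, here is a minimal sketch in
Python.  The helper debian_source_names is purely illustrative (it is
not part of any Debian tool); it only spells out the directory and
tarball names described above for a given source package name and
upstream version.

```python
def debian_source_names(source_package: str, upstream_version: str) -> dict:
    """Return the names the Debian source layout described above expects.

    Illustrative helper only; not part of dpkg or any Debian tooling.
    """
    source_dir = f"{source_package}-{upstream_version}"
    return {
        # Directory in which the Debian build is run.
        "source_dir": source_dir,
        # Pristine upstream tarball; note the underscore before the version.
        "orig_tarball": f"{source_package}_{upstream_version}.orig.tar.gz",
        # Debian-specific files (rules, control, ...) live here.
        "debian_dir": f"{source_dir}/debian",
    }


names = debian_source_names("foo", "1.1")
for key, value in names.items():
    print(f"{key}: {value}")
```

Calling it with "r-noncran-design" and "1.3" gives the directory name
used in the rest of this message.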

This means that we cannot download a tar file like Design-1.3-1.tar.gz
from an R archive, stick it in a directory called
./r-noncran-design-1.3/ then invoke the R package building mechanism on
Design-1.3-1.tar.gz.  The downloaded sources must expand to the
directory in which the Debian build process is run.

So according to the Debian conventions the top-level directory for the
r-noncran-design package should be named something like
 r-noncran-design-1.3

The build process for the Debian package is run in the top-level
package directory which should be the expanded tar.gz file.  In the R
convention the name of the directory formed by expanding the tar.gz
file is the name of the package, without any version information.  If
we call the directory r-noncran-design-1.3 then the R package gets
named r-noncran-design-1.3 when, according to the R conventions, it
should be called 'design'.  Users will have software that contains
calls like

 require("design")

not

 require("r-noncran-design-1.3")
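
The clash can be shown with a small Python sketch.  It assumes, as
described above, that the R build names the package after the last
component of the source directory; the function below models that
rule only and is not the actual R CMD INSTALL code.

```python
import os


def r_package_name_from_dir(source_dir: str) -> str:
    """Model of the rule described above: package name = directory basename.

    This is an assumption-level model, not the real R CMD INSTALL logic.
    """
    return os.path.basename(os.path.normpath(source_dir))


debian_style_dir = "./r-noncran-design-1.3/"  # what the Debian layout wants
r_style_dir = "./design"                      # what the R convention wants

name_under_debian_layout = r_package_name_from_dir(debian_style_dir)
name_under_r_layout = r_package_name_from_dir(r_style_dir)

print("Debian-style directory gives:", name_under_debian_layout)
print("R-style directory gives:     ", name_under_r_layout)
```

Only the second name matches what users' require() calls expect.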

At present we have two alternatives:

1) Name the Debian source package according to the R package name, as
Dirk suggests.  I don't think this is a viable long-term strategy.
The names, like "XML", are too vague.

2) Name the expanded directory according to the Debian package name,
build the R package under the wrong name, then rename a bunch of files
in ./debian/tmp/usr/lib/R to the correct name of the R package before
building the Debian package.  This works - sort of.  The preformatted
help pages get messed up by this process.
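
For what it is worth, the renaming step in alternative 2 amounts to
something like the following Python sketch.  The staging layout
(usr/lib/R/library/...) and the help file name are made up for the
demonstration, and rename_in_tree is a hypothetical helper, not what
any existing debian/rules file does.

```python
import os
import tempfile


def rename_in_tree(root: str, wrong: str, right: str) -> list:
    """Rename every file or directory under root whose name contains `wrong`.

    Walks bottom-up so that children are renamed before their parents.
    Hypothetical helper for illustration only.
    """
    renamed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if wrong in name:
                new_path = os.path.join(dirpath, name.replace(wrong, right))
                os.rename(os.path.join(dirpath, name), new_path)
                renamed.append(os.path.relpath(new_path, root))
    return renamed


with tempfile.TemporaryDirectory() as staging:
    # Assumed layout: the package was built under the wrong (Debian) name.
    wrong = "r-noncran-design-1.3"
    library = os.path.join(staging, "usr", "lib", "R", "library")
    os.makedirs(os.path.join(library, wrong, "help"))
    with open(os.path.join(library, wrong, "help", wrong + ".txt"), "w") as handle:
        handle.write("Help for " + wrong + "\n")

    renamed = rename_in_tree(staging, wrong, "design")
    fixed_dir_exists = os.path.isdir(os.path.join(library, "design"))
    with open(os.path.join(library, "design", "help", "design.txt")) as handle:
        leftover = handle.read()

print(renamed)
print(repr(leftover))
```

Note that the sketch only changes file and directory names; any text
inside the files that mentions the wrong name is left untouched.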

My suggested way out of this is to extend the R package installation
mechanism so that the package name can differ from the name of the
directory containing the package sources.  The name could be overridden
within the DESCRIPTION file or on the command line for the R CMD
INSTALL call.
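
A rough sketch of the lookup I have in mind, written in Python rather
than in the shell of the real script.  Everything here is
hypothetical: the function name, the cli_name argument standing in
for a command-line option that does not exist today, and the
precedence order (command line first, then the Package field of
DESCRIPTION, then the directory name) are assumptions for
illustration.

```python
import os


def resolve_package_name(source_dir: str,
                         description_text: str = "",
                         cli_name: str = None) -> str:
    """Pick the R package name, allowing it to differ from the directory.

    Hypothetical precedence: command line > DESCRIPTION > directory name.
    """
    if cli_name:
        return cli_name
    for line in description_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Package" and value.strip():
            return value.strip()
    # Fallback: the behaviour described earlier, name = directory basename.
    return os.path.basename(os.path.normpath(source_dir))


description = "Package: design\nVersion: 1.3-1\n"
from_description = resolve_package_name("./r-noncran-design-1.3", description)
from_directory = resolve_package_name("./r-noncran-design-1.3")
from_cli = resolve_package_name("./r-noncran-design-1.3", description, cli_name="other")

print(from_description, from_directory, from_cli)
```

With such a lookup the Debian build could run in
./r-noncran-design-1.3 and still produce a package that
require("design") finds.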

Kurt and Fritz: Is it clear why I want to override the name of the R
package during the build process?  I have looked at the shell script 
for R CMD INSTALL and it seems that the code for package bundles
already allows R_PACKAGE_NAME to be different from R_PACKAGE_DIR in
the do_install_source function.  Would it be feasible to allow the
package name for a simple package (i.e. not a bundle) to be different
from the directory name?  Would it be reasonably easy?


