
Re: How much data load is acceptable in debian/ dir and upstream



Hi Russ

On Mon, Sep 14, 2020 at 12:21:10PM -0700, Russ Allbery wrote:
> > I think we should try to document somehow, when there is a need for some
> > separate source package.  I would agree if the code is some kind of
> > moving target and data would not change or if there is some kind of
> > versioned downloadable tarball or the data can be shared between
> > different software package.  But here none of these conditions is
> > fulfilled.
> 
> Is there any overlap of the test data required by different packages?

In the packages that were rejected this is not the case.  We are
actually considering creating some kind of universal data set to be
used in several packages.  However, this is a tough task since there are
several different data formats and sometimes software is dedicated to
very specific data.

> I'm
> wondering if it would make sense to create a new native Debian package
> called debian-med-test-data or something like that, and put all of the
> data used for package test suites in that package.  The tests can then
> depend on it.

We will definitely keep this in mind.

> That may be a little inefficient for autopkgtest because it will need to
> download more data than is necessary to test a specific package, but a bit
> cleaner for the archive since it collects a class of data in one place and
> provides a natural place for supporting documentation, copyright
> information, and so forth.  It also provides a logical place to put
> supporting scripts to, say, refresh the download or restructure the data
> if required for different packages.  It feels a bit more self-documenting
> and obvious what's going on.

The problem does not only affect autopkgtest.  As I said, we try to
enable users to run the test suite on their local machines, either as a
kind of example or simply to verify that their machine reproduces the
tested behaviour.  In this case a big package would also need to be
installed on users' machines, which in most cases is not really needed.

> That also avoids the hassle of having to maintain a bunch of separate test
> data packages (although one could of course also do that) by collecting
> the packaging in one place.

I agree that this would avoid that hassle in the current cases (except
the one where upstream provided the data inside the tarball,
graphbin).  However, I think it is just a temporary solution to
the question my mail might boil down to:

  Provided that the license and copyright of the data in question are
  OK, is there any size limit for data to be stored under debian/?

I think we should answer this question and write the answer down in
our documents.

Kind regards

      Andreas.

-- 
http://fam-tille.de

