Re: Large data packages in the archive
On Sun, May 25, 2008 at 08:18:01PM +0200, Joerg Jaspert wrote:
> So assume we go for solution c. (which is what happens unless someone
> has a *very* strong reason not to, which I currently can't imagine) we
> will set up a separate archive for this. This will work the same way as
> our main archive does, with a few notable points:
> - It will be solely arch:all, not split per architecture. Or, if
> someone presents *good* reasons why a data archive needs to be
> architecture-aware, we will also offer this, but *NO* autobuilder
> support will be provided.
> This is meant as a place for large datasets, and those should be
> arch independent. And would kill many autobuilders (think of binary
> packages having 500, 800 or more megabytes!)
> - It is its own archive, so it needs full source uploads to work,
> every data package you create will be a full source package and you
> have to split the source between this archive and the rest that goes
> into the normal Debian one.
> Any comments?
First, thanks a lot for taking care of this issue.
As you mentioned, there have been several discussions about this before.
I'd be curious what you think about a scenario for data source packages
that has been outlined by Anthony Towns before:
Basically, it is a virtually empty source package that build-depends on
the binary package that it builds. I made a draft of such a source
package, which I use to build a 1 GB data package that I currently host
myself. It is initially built by forcing dpkg-buildpackage to ignore
the build-dep. When the necessary data is not detected at build time,
a small script downloads it from upstream and puts it in the build tree. This
scenario has the advantage that it avoids doubling the size of the archive
when the data in the source and binary packages is identical.
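To make the build-time step concrete, here is a rough sketch of what such
a fetch helper could look like. It is not the script from the draft
package: the file name, the upstream URL and the function names are all
hypothetical placeholders. It assumes the data is a single file, taken
from the previously installed binary package when the build-dep is
satisfied, and downloaded from upstream otherwise.

```python
import shutil
import urllib.request
from pathlib import Path

# Hypothetical placeholders: none of these names come from the real package.
UPSTREAM_URL = "https://upstream.example.org/dataset.tar.gz"
DATA_NAME = "dataset.tar.gz"


def download(url: str, dest: Path) -> None:
    """Fetch url into dest (only needed for the initial bootstrap build)."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


def ensure_data(build_tree: Path, installed: Path, fetch=download) -> str:
    """Make sure the data file is present in the build tree.

    Returns where the data came from: "build-tree", "installed" or "upstream".
    """
    target = build_tree / DATA_NAME
    if target.exists():
        # Data is already in the build tree, nothing to do.
        return "build-tree"
    build_tree.mkdir(parents=True, exist_ok=True)
    if installed.exists():
        # Build-dep satisfied: reuse the data shipped by the installed
        # binary package instead of downloading it again.
        shutil.copy2(installed, target)
        return "installed"
    # Bootstrap case (build-dep was ignored): get the data from upstream.
    fetch(UPSTREAM_URL, target)
    return "upstream"
```

For the very first build, the build-dependency check would have to be
skipped, for example with dpkg-buildpackage -d; later builds could then
take the "installed" branch and never touch the network.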
An example package is here (source is just a few kB, but binary is
Would you support/accept such a package?
What about licenses? Is data.debian.org just for stuff that could go in
main or also non-free stuff (the package above has a non-commercial license)?
> Timeframe for this? I expect it to be ready within 2 weeks.
BTW, what would be the new maximum size of a package for data.debian.org
-- 10 GB? ;-)
GPG key: 1024D/3144BE0F Michael Hanke