
Re: Packaging WM themes - question



vlm@norlight.com (Vince Mulhollon) wrote:

> I'll put my summary at the top and bottom of this email...  I want "debian
> data packages" to be real packages that deeply integrate with Debian, not
> simply run a ".zip to .deb format converter" on textfiles.com, and/or a
> simple mirror of textfiles.com.  A .deb of a binary is more than a renamed
> ".tar.gz" and that's what I want from a .deb of a data file.  Nothing deeply
> integrates with a binary .deb based Debian system better than another .deb
> based package, data or binary....

Okay.  Good summary.  I'll follow suit with my own.  I counter that
this is a waste.  I assert that in the vast majority of cases,
building a "data" deb file is simply moving the data from one tar
bundle to another.  (Recall that a deb file is essentially two tar
files, one for the data, one for the meta-data, combined into an ar
archive.)  Our resources can be better spent if we simply provide
pointers to what is already out there.
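
For readers who want to see the "tar files inside an ar archive" layout
for themselves, here is a minimal sketch of an ar member lister.  It is
an illustration only, not part of dpkg.  The names list_ar_members and
make_ar are made up for this example, and the demo archive is built in
memory with placeholder contents rather than read from a real .deb.  (A
real .deb also carries a small "debian-binary" version marker alongside
the control and data tarballs, so the demo includes one.)

```python
AR_MAGIC = b"!<arch>\n"   # global header of every ar archive
HEADER_LEN = 60           # fixed size of each member header


def list_ar_members(raw):
    """Return a list of (name, size) for each member of an in-memory ar archive."""
    if not raw.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(raw):
        header = raw[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        # Name is the first 16 bytes, size is a decimal field at offset 48.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        # Member data is padded to an even length.
        pos += HEADER_LEN + size + (size % 2)
    return members


def make_ar(files):
    """Build a tiny ar archive from {name: bytes}.  Demo helper only."""
    out = bytearray(AR_MAGIC)
    for name, body in files.items():
        header = "%-16s%-12d%-6d%-6d%-8s%-10d`\n" % (
            name, 0, 0, 0, "100644", len(body))
        out += header.encode("ascii") + body
        if len(body) % 2:
            out += b"\n"
    return bytes(out)


# Placeholder contents; a real .deb would hold gzip-compressed tarballs here.
demo = make_ar({
    "debian-binary": b"2.0\n",
    "control.tar.gz": b"ctl",
    "data.tar.gz": b"payload",
})
print(list_ar_members(demo))
```

Running the same lister over the bytes of a real .deb should show the
same three kinds of members, which is the point being made above: the
payload is a tarball, and the rest is a thin wrapper around it.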

Currently on this list there is talk about building yet another
installer package.  That is fine and a good approach, but my proposal
is to avoid reinventing the wheel and provide a data installation
understructure, upon which real packages can build.  I think that
augmenting dpkg with the ability to grab data, install it, and track
its contents would be a win for all of these packages.  Furthermore,
if we can avoid having to build everything into a stupid deb file,
then we can cover far more material with far fewer resources.

> Regarding the double hit on disk space, "simply" change the system to not
> create errors on a "data" section package if no .orig.tar.gz is
> uploaded, ...

How does one rebuild the deb then?

> or working the "opposite" way, don't allow .debs in "data" section, only
> .orig and .diff (and .dsc I suppose)

How does one install the package then?

See ... the way the whole deb system works must be changed somehow,
no matter what you do.

> I would still push for filing the metadata where the copyright of the data
> itself lies.  I like the idea that if I install a debian box using only
> stable/main I can guarantee that there are no license encumbrances on the
> machine of any sort, it's all DFSG.

That is a matter of separating the data section into
main/contrib/non-free/etc., which I had assumed would be done
regardless.

> I use the term ".diff" to stand for whatever you call any modifications to
> the data, including the addition of metadata, and ".deb" to stand for
> whatever you call the thing you upload that provides the metadata to
> Debian.

I propose to keep the meta-data in some sort of database.

> >> There is no deb.  We mirror the tar.gz file.  That's all.
> 
> Typically a binary program will have a source and a diff full of metadata
> (control file, copyright file, ...) and a .deb that is ready to install.
> 
> Your mirroring the source and a diff/metadata/whatever you call it, is not
> much different.
> 
> If you mirror the tar.gz, you are storing an equivalent set of data
> (admittedly, with an extra source file) but in a different format.

True.  The big win of using a tar.gz, however, is that we are not
required to mirror the file.  We can provide a pointer to the data
instead of having to download it and store it ourselves.  If all that
is required to install the data is to unpack it, why do we need a deb
for that?  Modify dpkg to use the tar.gz itself.

> I'm not sure replicating the procedures and tools for .deb would be easier
> than a "simple" patch to the .deb system that merely does not require an
> .orig file for a package in section "data".  Or work it the other way, if a
> package is in section "data" do not allow a .deb file to exist, make the
> user download the .orig and .diff and the user has to "compile" it (not
> much effort to compile a text file).

I'm not saying that my approach is easier.  I am saying that the extra
effort is rewarded by certain advantages.

> >> Hmm ... when something changes is it easier to (1) download, rebuild,
> >> upload; or (2) change a link.  In my opinion, it requires less work to
> >> do (2).
> 
> How are you proposing to fix the metadata?  You still have to download the
> old metadata, change the link, and upload the new metadata.

The meta-data should be small enough to not require its own file.
Consider a package with no maintainer scripts (which should be the
case for a package with data only).  The only meta-data is the control
file, and that is already stored in a database of sorts, the
Packages file.  Therefore, keep the data in its tar file and place the
meta-data in a database.  That is what we should do.

> The closest comparison I can think of is a .deb of a binary changing its
> upstream URL.  You'd apt-get source package, dch, change the URL, debuild,
> dupload.  In comparison to your system where you'd "download metadata",
> change the URL, "upload the metadata".  Same tasks, fetch it, fix it,
> distribute it.

The only change is to modify the link in the database, which is
simpler than uploading and downloading files.  It could even be done
with some sort of web interface.
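
As a rough illustration of "meta-data in a database, data left
upstream", the sketch below keeps one row per data package and treats a
moved upstream file as a single-field update.  Everything in it is
hypothetical: the table layout, the function names, the package name
"pluto-journal", and the example.org URLs are assumptions made up for
this example, not an existing Debian schema or tool.

```python
import sqlite3


def open_catalog():
    """Create an in-memory catalog with one row per data package."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE data_package ("
        " name TEXT PRIMARY KEY,"
        " version TEXT,"
        " section TEXT,"      # e.g. main / contrib / non-free
        " depends TEXT,"
        " url TEXT)"          # pointer to the upstream tar.gz
    )
    return db


def add_entry(db, name, version, section, depends, url):
    db.execute(
        "INSERT INTO data_package VALUES (?, ?, ?, ?, ?)",
        (name, version, section, depends, url),
    )


def change_link(db, name, new_url):
    """Point an existing entry at a new upstream location."""
    cur = db.execute(
        "UPDATE data_package SET url = ? WHERE name = ?", (new_url, name))
    if cur.rowcount != 1:
        raise KeyError(name)


def lookup_url(db, name):
    row = db.execute(
        "SELECT url FROM data_package WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    return row[0]


db = open_catalog()
add_entry(db, "pluto-journal", "1.0", "main", "",
          "http://example.org/old/journal.tar.gz")
# Upstream moved the file: only the pointer changes, nothing is rebuilt.
change_link(db, "pluto-journal", "http://example.org/new/journal.tar.gz")
print(lookup_url(db, "pluto-journal"))
```

The same update could sit behind a web form, which is all the "web
interface" would need to be in this picture.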

> You'd be building a set of metadata that is appropriate for the raw data,
> like URLs, copyrights, dependencies, control files.   You'd be uploading
> the metadata into the debian system somehow so debianites can use it.
> 
> I interpret your position as the metadata should be contained in a new and
> separate system from the one we use for binaries.  I'm arguing that the
> system we use to store binary metadata would work well enough for "raw
> data", and could be much more efficient with some really minor
> modifications...

You're not thinking outside of the "deb package" mentality.  There are
other ways of doing things.

We have packages in stable that are simply copies of the Pluto online
Journal.  To have packages like this is silly, of course, but someone
thought it would be nice if Debian users could grab issues of this
journal conveniently using Debian's package manager.  I agree with
that point, which is why I am proposing this scheme.  With my
proposal, we would not have to package the journals ourselves, but
instead could provide pointers to the journals, which are stored on
the upstream site.  Then Debian users would have the convenience of
grabbing this "data," installing it, and using Debian's system to
track the data.

> The decision is whether to make a tiny change to the present
> infrastructure, or create a near duplicate of our current infrastructure
> with different names for the same concepts and procedures.

But the new infrastructure (which, if implemented properly, can work
very well with the existing infrastructure) can provide useful
advantages not possible with the present system.  That is why I think
we should look into another way of doing things.

> >> That is not what I proposed.  My proposal is for a tool which downloads
> >> a tar.gz of data and installs it directly onto the system.  No deb is
> >> created.  There is no diff file.
> 
> So, you want netscape to download and decompress a file when you click on a
> URL?  You can do that with no modifications.  Add a mime type and a handler
> for it, maybe.

And netscape tracks which files it installed?  Will it also uninstall
these files when I no longer want them?

> I would prefer a method of integrating the data deeply into the Debian
> system.  Dependencies on other data files, Dependencies on binary packages.

Fine.  Dependencies can be part of the new system.

> Copyright files.

If the data is copyrighted, the copyright notice will be contained in
the tar file.  The type of license can be specified in the meta-data.

> Man and info pages describing data formats, describing each
> individual data file.

Each individual data file?  I believe you misspoke here.  Certainly,
you do not want a "package" with 400 data files to also have 400 man
pages.

If the data files are in a format to be used by a program, then the
program's package should supply the man page describing the format of
the data file, not the package containing the data file.  Therefore,
these things will be in a deb package already.

> Control files with useful sections for the data.  Conformance to the
> filesystems directory structure.

In general, the filesystem's directory structure mandates that data
should be kept together under a single subdirectory.  If the data
belongs together, why scatter it throughout the filesystem?

> Management of data sets using the fancy .deb tools instead of ftp
> and tar.

A deb file is simply a fancy tar archive.  Provide a tool that can
unpack a tar and track which files belong to that tar.  That should
provide all of the functionality that you need.
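
A minimal sketch of such a tool follows, assuming a JSON manifest is an
acceptable way to remember which files came from which tarball.  The
function names install_tarball and uninstall, the manifest format, and
the demo file names are all assumptions made for illustration; this is
not how dpkg records its file lists.  The sketch handles regular files
only and refuses archive members that would land outside the target
directory.

```python
import io
import json
import os
import shutil
import tarfile
import tempfile


def install_tarball(tar_path, dest, manifest_path):
    """Unpack regular files from tar_path into dest and record them."""
    dest = os.path.realpath(dest)
    installed = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # sketch: skip directories, links, devices
            target = os.path.realpath(os.path.join(dest, member.name))
            if os.path.commonpath([dest, target]) != dest:
                raise ValueError("unsafe path in archive: " + member.name)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            installed.append(os.path.relpath(target, dest))
    with open(manifest_path, "w") as fh:
        json.dump(installed, fh)
    return installed


def uninstall(dest, manifest_path):
    """Remove every file listed in the manifest, then the manifest."""
    with open(manifest_path) as fh:
        for rel in json.load(fh):
            path = os.path.join(dest, rel)
            if os.path.exists(path):
                os.remove(path)
    os.remove(manifest_path)


# Demo: build a one-file tarball in a scratch directory and install it.
work = tempfile.mkdtemp()
tar_path = os.path.join(work, "journal.tar.gz")
with tarfile.open(tar_path, "w:gz") as tar:
    body = b"issue one\n"
    info = tarfile.TarInfo("journal/issue1.txt")
    info.size = len(body)
    tar.addfile(info, io.BytesIO(body))

dest = os.path.join(work, "root")
os.makedirs(dest)
manifest = os.path.join(work, "journal.manifest")
files = install_tarball(tar_path, dest, manifest)
print(files)
```

Calling uninstall(dest, manifest) afterwards removes exactly the files
that were recorded.  The sketch leaves emptied directories behind, and
a real tool would also need to handle conflicts between tarballs that
ship the same path.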

> Tracking of bugs (everything from faulty data to typos to copyright
> violations) in the BTS.

It does not have to be a package for it to be tracked by the BTS.

> Automatic updates of data and "living documents" using apt-get
> (install debian weekly news once, read the latest in your MOTD
> forever).

Automatic updates occur when the database of meta-data is updated.

> Data viewers and data manipulators added to the menu system.
> Debconf ability to run a translator on the raw data (tar.gz is a
> .sgml and you want a .txt generated, debconf to the rescue!).  For
> data like satellite, tides, and astronomical data, use logrotate on
> the data.  Add mime types and mime handlers when an unusual data
> format is installed.  I want an icon for the new mime type
> registered with kde and gnome, if they are installed.  Add entries
> or a whole nother dictionary to the dict system if necessary.
> Locales support for translations.  Additions (if any required) to
> config files of other programs (extensions to emacs to read those
> data files).

Spurious arguments.  These are all issues for the package containing
the program that uses the data, not for the data themselves.  For
example, if you have an extension to emacs to read these files, then
package the reader in a conventional package and require that the data
package depend on the emacs reader.  That is how you would do this in
any case, even if "data" deb packages are used.  The reader would not
be packaged *inside* the data package.

> Symbolic links of appropriate graphics into the backgrounds
> directories of kde and gnome if installed.

I had proposed that the new scheme would be able to provide symbolic
links.  This is possible.

> In summary, I want "debian data packages" to be real packages that deeply
> integrate with Debian, not simply run a ".zip to .deb format converter" on
> textfiles.com, and/or a simple mirror of textfiles.com.  A .deb of a binary
> is more than a renamed ".tar.gz" and that's what I want from a .deb of a
> data file.

But a true .deb of a data file (if done correctly) is just that: a
renamed tar.gz file.  The only other thing in it is a control file,
which is present only so that it can be incorporated in our Packages
database.

If all we have is a tar and a database, then why do we need the deb?
Why do we need to store the tar if someone else is already providing
it?

> Nothing deeply integrates with a binary .deb based Debian
> system better than another .deb based package, data or binary....

That is true now, but please don't tell me that this always has to be
the case.

- Brian


