
DDE, Debian Data Export



Hello,

After testing the idea and the prototype with my presentation at Fosdem,
it's time to announce DDE (http://wiki.debian.org/DDE), Debian Data
Export.

 * The problem

In Debian, we publish all sorts of information, using all sorts of data
formats (often ad hoc and obscure), in obscure places.  Try to think of
an application, for example, that wants to access all this information
together:

 * Maintainer <-> Source package mapping
 * Popcon rankings
 * Changelogs
 * .desktop files of packages not installed
 * What is in the new queue
 * Package screenshots
 * Some statistics [1]
 * Localisation information
 * uscan status
 * Buildd logs
 * sloccount run results
 * Debian Weather
 * Debian Pure Blend specific information

A nightmare, huh?

[1] For example, to pick two recently announced:
    http://ftp-master.debian.org/~joerg/pkg-nums
    http://ftp-master.debian.org/~joerg/arch-space


 * The solution

DDE is a way to make it simple to publish and download data.  The aim is
to be able to access all sorts of Debian information without worrying
about data formats, protocols and access control, and to make it easy to
discover what data is available.

DDE exports data as a big virtual tree.  You can pick a node in the tree
by its URL and download all the data that it contains, in a format of
your choice: currently it supports JSON/JSONP, YAML, CSV and Python
pickled objects.

This means that you can now get Debian data using a trivial HTTP client
tool or library, and read it using commonly available decoders: both
should be available in almost any programming language nowadays.  Since
JSON and JSONP are supported, this even includes JavaScript in Ajaxy web
pages.
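As a rough illustration of how little client code this takes, here is a
minimal Python sketch. The base URL `http://dde.example.org/q`, the `t`
query parameter used to select the format, and the node path
`popcon/rank` are all placeholders assumed for the example, not
documented DDE endpoints: check the wiki page for the real ones.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Hypothetical base URL: replace with the address of a real DDE server.
DDE_BASE = "http://dde.example.org/q"


def node_url(path, fmt="json", base=DDE_BASE):
    """Build the URL of a node in the tree.

    Using a `t` query parameter to pick the output format is an
    assumption made for this sketch.
    """
    segments = [quote(part, safe="") for part in path.strip("/").split("/") if part]
    return base.rstrip("/") + "/" + "/".join(segments) + "?" + urlencode({"t": fmt})


def decode(body):
    """Decode a JSON response body into plain Python objects."""
    return json.loads(body)


def fetch(path, base=DDE_BASE):
    """Download one node as JSON and decode it (needs network access)."""
    with urlopen(node_url(path, "json", base)) as response:
        return decode(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch() against a real server.
    print(node_url("popcon/rank"))
```

The same two steps (build a URL, decode the body) should carry over to
any language with an HTTP client and a JSON or YAML parser.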

DDE is not a competitor to UDD (http://wiki.debian.org/UDD): UDD is
about creating a central location where all the data can be accessed,
while DDE is about giving people a simple way to access data or subsets
of data.

In a way, DDE and UDD complement each other: the more data enters UDD, the
more data is available for DDE.  In turn, DDE gives a simple interface
to the most popular and useful UDD queries.


 * The dream

Here are some hints at what can be done with this:

 * Autocompletion in HTML fields
 * Export data to feed external sites like debtags.debian.net or
   screenshots.debian.net
 * Have a way for package managers to easily access all sorts of data
 * Have a way to implement fancy tools that can query massive data sets
   without needing to download them locally


 * A call for action

You can add data to the DDE tree by just putting a data file in YAML,
JSON or pickle format under `~/.dde`: I've written a specific guide
to this on the Debian wiki, see http://wiki.debian.org/DDE/HomeFiles

If you wish to create new and fancy Debian statistics or compute other
sorts of data, or if you already maintain tools that generate Debian
data, such as:

 * Popcon rankings
 * .desktop files of packages not installed
 * Content of the new queue
 * Screenshots
 * All sorts of statistics
 * Debian Weather

Then please have a look at http://wiki.debian.org/DDE/HomeFiles and
try to publish your data in ~/.dde on merkel.debian.org.
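A script that generates such data could end with something like the
Python sketch below. The `publish` helper, the file name
`example-stats` and the `.json` extension are made up for the example;
how DDE maps file names to nodes in the tree is described in the
HomeFiles guide, which is the authoritative reference.

```python
import json
import os


def publish(data, name, dde_dir=None):
    """Write `data` as a JSON file under the DDE directory.

    `name` is a hypothetical file name chosen for this sketch; the
    directory defaults to ~/.dde as described above.
    """
    if dde_dir is None:
        dde_dir = os.path.join(os.path.expanduser("~"), ".dde")
    os.makedirs(dde_dir, exist_ok=True)
    target = os.path.join(dde_dir, name + ".json")
    with open(target, "w", encoding="utf-8") as out:
        json.dump(data, out, indent=2, sort_keys=True)
    return target


if __name__ == "__main__":
    # Invented sample data, only to show the shape of a call.
    print(publish({"packages": 3, "bugs": 7}, "example-stats"))
```

Plain dictionaries and lists are a safe choice here, since they
serialise equally well to JSON, YAML and pickle.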

For more complicated cases (like accessing a remote database), it is
possible to extend DDE via Python plugins[1]. You can get in touch with
me if you need to go that way.

[1] (http://wiki.debian.org/DDE/WritePlugin)


Ciao,

Enrico

Note: this mail also appeared as a blog post at
http://www.enricozini.org/2009/debian/dde.html

-- 
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>
