
Re: RDF data about packages in the archive



Hi Andrea,

Quoting Andrea Pappacoda (2025-10-29 19:21:06)
> I'm searching for an RDF "dataset" of all the packages part of Debian 
> (either a given release, or unstable). Do we have something like this?
> 
> I'm not looking for something extremely well-described, just for 
> something which, for example, can be used to obtain a hierarchical view 
> of packages by section.
> 
> I tried searching a bit in <https://wiki.debian.org/RDF>, but haven't 
> found anything useful/still working :/
> 
> Should I just look harder?

Excellent question, and one that I happen to have an answer for, since
I looked at that exact challenge just weeks after we parted ways in France
:-)

Metadata used to be offered at packages.qa.debian.org, but it has not been
updated since 2024-05-22. If you have ssh access to Debian hosts, you can
fetch the last generated full dump like this:

    rsync -av packages.qa.debian.org:/srv/packages.qa.debian.org/www/web/full-dump.tar.bz2 .

The full-dump dataset is serialized as RDF/Turtle,
~30 MB compressed and 300 MB uncompressed,
and can be bootstrapped and served as a SPARQL endpoint,
e.g. like this:

    sudo apt install oxigraph
    tar xfO full-dump.tar.bz2 | pv | oxigraph load --location oxigraph.db --format ttl
    oxigraph serve --location oxigraph.db

The on-disk database created above takes up ~1.2 GB.
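
Once the endpoint is up, a sensible first step is probably to find out
which predicates the dump actually uses, since I am not assuming any
particular vocabulary here. The following is only a rough sketch: it
assumes that oxigraph listens on its default address (localhost:7878),
that curl is installed, and the function name is made up for
illustration:

```shell
#!/bin/sh
# Assumption: `oxigraph serve` is running with its default bind address.
ENDPOINT='http://localhost:7878/query'

# Vocabulary-agnostic query: count how often each predicate occurs,
# to discover which property (if any) carries the package section.
QUERY='SELECT ?p (COUNT(*) AS ?n) WHERE { ?s ?p ?o } GROUP BY ?p ORDER BY DESC(?n) LIMIT 50'

# Hypothetical helper: send the query via the SPARQL 1.1 protocol
# (form-encoded POST) and ask for tab-separated results.
predicate_counts() {
    curl --silent --fail \
        --header 'Accept: text/tab-separated-values' \
        --data-urlencode "query=$QUERY" \
        "$ENDPOINT"
}
```

Calling predicate_counts should then print one predicate per line with
its count. The query scans the whole store, so it may take a while. If
one of the predicates turns out to hold the section, grouping packages
by that predicate would give the hierarchical view you asked about.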

Hope that is of some use.

If you need up-to-date metadata then that's harder but should be doable:
I have a Perl program that fetches package metadata, which I want to
extend to export RDF. I just haven't taken the time to do that yet.

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private

