Re: RDF data about packages in the archive
Hi Andrea,
Quoting Andrea Pappacoda (2025-10-29 19:21:06)
> I'm searching for an RDF "dataset" of all the packages part of Debian
> (either a given release, or unstable). Do we have something like this?
>
> I'm not looking for something extremely well-described, just for
> something which, for example, can be used to obtain a hierarchical view
> of packages by section.
>
> I tried searching a bit in <https://wiki.debian.org/RDF>, but haven't
> find anything useful/still working :/
>
> Should I just look harder?
Excellent question, and one that I happen to have an answer for, since
I looked at that exact challenge just weeks after we parted ways in France
:-)
There used to be metadata offered at packages.qa.debian.org, but it has
not been updated since 2024-05-22. If you have ssh access to Debian
hosts, you can fetch the latest generated full dump like this:
rsync -av packages.qa.debian.org:/srv/packages.qa.debian.org/www/web/full-dump.tar.bz2 .
The full-dump dataset is serialized as RDF/Turtle,
~30 MB compressed and 300 MB uncompressed,
and can be bootstrapped and served as a SPARQL endpoint,
e.g. like this:
sudo apt install oxigraph
tar xfO full-dump.tar.bz2 | pv | oxigraph load --location oxigraph.db --format ttl
oxigraph serve --location oxigraph.db
The on-file database above requires ~1.2 GB on disk.
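Once the server is running, a first exploratory query could look like
the rough sketch below. I am assuming that oxigraph listens on its
default address (localhost:7878) with the SPARQL query endpoint at
/query, and the run_query helper is just a name made up for this
example. I have not described the vocabulary used in the dump here, so
the query only counts which predicates occur, as a starting point for
finding the ones that carry section information:

```shell
# Assumption: "oxigraph serve" is running on its default localhost:7878.
ENDPOINT="http://localhost:7878/query"

# Vocabulary-agnostic query: list the 20 most used predicates.
QUERY='SELECT ?p (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?n)
LIMIT 20'

# Hypothetical helper: POST a SPARQL query and print the result as CSV.
run_query() {
  curl --silent --fail \
    --header 'Accept: text/csv' \
    --data-urlencode "query=$1" \
    "$ENDPOINT"
}
```

With the server up, run_query "$QUERY" should print one line per
predicate, which ought to make it easier to spot how sections are
modelled before writing a more specific query.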
Hope that is of some use.
If you need up-to-date metadata then that's harder but should be doable:
I have a Perl program that fetches package metadata, which I want to
extend to export RDF. I just haven't taken the time to do that yet.
- Jonas
--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
* Sponsorship: https://ko-fi.com/drjones
[x] quote me freely [ ] ask before reusing [ ] keep private