
Re: RDF data about packages in the archive



Hi Jonas,

On Wed Oct 29, 2025 at 8:38 PM CET, Jonas Smedegaard wrote:
Excellent question, and one that I happen to have an answer for, since
I looked at that exact challenge just weeks after we parted ways in France
:-)

What a coincidence!

There used to be metadata offered at packages.qa.debian.org, but it has
not been updated since 2024-05-22. If you have ssh access to Debian
hosts, you can fetch the latest generated full dump like this:

    rsync -av packages.qa.debian.org:/srv/packages.qa.debian.org/www/web/full-dump.tar.bz2 .

The full-dump dataset is serialized as RDF/Turtle,
about 30 MB compressed and 300 MB uncompressed,
and can be loaded into a store and served as a SPARQL endpoint,
e.g. like this:

    sudo apt install oxigraph
    tar xfO full-dump.tar.bz2 | pv | oxigraph load --location oxigraph.db --format ttl
    oxigraph serve --location oxigraph.db
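
Once the server is running, a quick query shows whether the load worked. The following is only a sketch: it assumes `oxigraph serve` listens on its default address (taken here to be http://localhost:7878, with the query endpoint at /query), and `sparql_query` and `ENDPOINT` are names made up for this example.

```shell
# Assumption: `oxigraph serve` is listening on its default address.
# Override ENDPOINT if the server was started with a different bind address.
ENDPOINT=${ENDPOINT:-http://localhost:7878/query}

# Hypothetical helper: send the SPARQL query given as the first argument
# as a URL-encoded form POST and print the results as SPARQL JSON.
sparql_query() {
    curl --silent --fail \
        --header 'Accept: application/sparql-results+json' \
        --data-urlencode "query=$1" \
        "$ENDPOINT"
}
```

For example, `sparql_query 'SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10'` should list ten arbitrary triples, which is an easy way to see which vocabulary the dump uses before writing more specific queries.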

Unfortunately, oxigraph 0.5.2-1 complains with this error:

   313MiB 0:00:09 [32,9MiB/s] [   <=> ]
   Error: Parser error at line 6991772 between columns 1 and 3: No scheme found in an absolute IRI

Passing `--lenient` to `oxigraph load` seems to work, though. RDFGlance complains as well but loads the graph nonetheless.
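
To see what the parser actually trips over, the reported line can be printed straight from the stream. This is a rough sketch: `show_line` is a helper name invented here, and it assumes the line number in the error counts lines of the same concatenated stream that `tar xfO` pipes into the loader.

```shell
# Hypothetical helper: print line N of stdin, plus one line of context
# before and after it, each prefixed with its line number.
show_line() {
    awk -v n="$1" '
        NR >= n - 1 && NR <= n + 1 { print NR ": " $0 }
        NR > n + 1 { exit }   # stop reading once past the context window
    '
}
```

Something like `tar xfO full-dump.tar.bz2 | show_line 6991772` should then show the offending statement without unpacking the dump to disk.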

Hope that is of some use.

Yep! Out-of-date data is fine for my use case :)

If you need up-to-date metadata then that's harder but should be doable:
I have a Perl program that fetches package metadata, which I want to
extend to export RDF; I just haven't taken the time to do that yet.

Please keep us updated on this. Bye!

