
Re: Advocating the use of RDF for Debian's published metadata



>>>>> Matthias Klumpp <matthias@tenstral.net> writes:

[…]

 > It would be very nice, if ftpmasters could tell if they would accept
 > a new format in the archive or if we should stay with RFC822 which is
 > used for nearly everything else already.

 >> Note that the same rationale stands for all metadata to be
 >> eventually published on the Web by Debian servers.

 >> Hope this helps.

 > Thank you for the information... I think RDF would be much more
 > "open" for other people and apps to use, as the data wouldn't be in a
 > Debian-specific format. (I can't imagine yet what others would do
 > with this data, but if more people would use RDF, e.g. other
 > distributors too, having it all in one standardized and extensible
 > format would be something valuable)

	Well, having this data aligned with the RDF model will help
	interoperability, I guess.

	One application I have in mind is that it becomes possible to
	query the Debian Packages and Sources databases using the
	powerful SPARQL language.  In particular, one may quickly check
	whether there are any packages that are transitively dependent
	on A, while also immediately dependent on B.  (Yes, grep-dctrl(1)
	helps, but it's not quite as powerful a language as the recent
	edition of SPARQL, not to mention that it's yet another query
	language to learn.)
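
	To make that query concrete, here is a rough sketch in plain
	Python over a toy dependency graph.  The package names, the
	DEPENDS table and the ex:depends predicate in the comment are
	all made up for illustration; nothing here reflects an existing
	Debian vocabulary or tool.

```python
# Toy dependency graph: package -> set of immediate dependencies.
# All names and edges are hypothetical, for illustration only.
DEPENDS = {
    "app": {"libfoo", "b"},
    "libfoo": {"a"},
    "tool": {"a"},
    "other": {"b"},
    "a": set(),
    "b": set(),
}


def transitive_deps(pkg, graph):
    """Return every package reachable from pkg via one or more edges."""
    seen = set()
    stack = list(graph.get(pkg, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, ()))
    return seen


def query(graph, a, b):
    """Packages depending on a transitively and on b immediately."""
    return sorted(
        pkg
        for pkg, direct in graph.items()
        if b in direct and a in transitive_deps(pkg, graph)
    )


# Roughly the SPARQL 1.1 property-path form, assuming a hypothetical
# ex:depends predicate and packages named by URIs:
#   SELECT ?pkg WHERE { ?pkg ex:depends+ ex:a . ?pkg ex:depends ex:b . }
print(query(DEPENDS, "a", "b"))
```

	With SPARQL the traversal loop would presumably collapse into
	the single property path shown in the comment, which is the
	appeal.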

	However, I believe that it's infeasible to change the native
	format of the aforementioned databases: it isn't going to be
	easy to implement, and it may place a considerable burden on
	both Debian users and maintainers.

	Thus, my opinion is that there should be a tool performing
	conversion from Debian's native database format to some RDF
	representation.  In particular, rdfproc(1) could become such a
	tool, provided that Raptor is extended to parse RFC 822.

	That being said, I don't see such a conversion as a simple and
	straightforward process.  In particular, should a package
	stanza be transformed into a named node (as per Package:) or a
	blank node (with Package: as an explicit relation)?  Depends: a, b
	and Depends: a | b may both have an RDF list in the object
	position, but how to distinguish between them?  And how should a
	line such as Depends: a (>> 0.1) be expressed?  Should the
	package names be encoded as string literals, or should they be
	transformed into URIs instead?  There are quite a few choices to
	be made by whoever volunteers for this.
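
	As one possible starting point, the sketch below parses a
	Depends: value into a nested structure that keeps "a, b" and
	"a | b" apart, and retains the version constraint.  This is only
	one of the choices discussed above, not a proposal for the
	mapping; parse_depends and TERM are hypothetical names, and
	architecture qualifiers and other refinements of the field
	syntax are deliberately ignored.

```python
import re

# One dependency term: a package name, optionally followed by
# "(operator version)".  Simplified on purpose.
TERM = re.compile(
    r"^\s*([a-z0-9][a-z0-9+.-]*)"
    r"\s*(?:\(\s*(<<|<=|=|>=|>>)\s*([^\s)]+)\s*\))?\s*$"
)


def parse_depends(value):
    """Parse a Depends: value into a list of groups.

    Outer list: comma-separated groups, all of which must hold.
    Inner list: "|"-separated alternatives, any one of which suffices.
    Each term is a (name, operator, version) tuple; operator and
    version are None when no constraint is given.
    """
    groups = []
    for group in value.split(","):
        alternatives = []
        for term in group.split("|"):
            match = TERM.match(term)
            if match is None:
                raise ValueError(f"cannot parse dependency term: {term!r}")
            alternatives.append(match.groups())
        groups.append(alternatives)
    return groups


for example in ("a, b", "a | b", "a (>> 0.1)"):
    print(example, "->", parse_depends(example))
```

	Under this reading, "a, b" yields two groups of one term each,
	while "a | b" yields one group of two terms, so an RDF mapping
	could, for instance, use distinct predicates or node types for
	the outer and inner lists.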

	These questions have been on my TODO list for some time (filed
	under Category: nifty hack, as I'm yet to see any serious
	practical uses for such a thing), but I'm short of spare time
	these days, and probably won't be able to do much, apart from
	participating in the discussions on this subject.

-- 
FSF associate member #7257

