Re^4: Debian Metadata Proposal -- draft rev.1.4

To: debian-doc@lists.debian.org
Subject: Re^4: Debian Metadata Proposal -- draft rev.1.4
From: Marco.Budde@hqsys.antar.com (Marco Budde)
Date: 08 Jul 98 19:31:00 +0100
Message-id: <9e9_9807082104@antares.antar.com>
In-reply-to: oavhpb1rkx.fsf@burrito.fake
On 06.07.98 apharris # burrito.onshore.com wrote ...

Hi Adam!

APH> done, mostly in relation to hamm installation on laptops.  I'm sorry
APH> if you disagree with how I prioritize my time, but I tried to decide
APH> how I could help the project most.

OK, but you should remember that other people, like me, have to wait for
you to finish our standard. In the near future I won't have a lot of free
time to work on dhelp.

APH> Right now, we could have a 'html2debmeta' which converts DC marked-up
APH> HTML files into a docreg file, and 'debmeta2html' vice versa.  RDF

Do you know of any HTML files that use DC? With a little Perl script it's
never a big problem to convert such information (see my sgml2dhelp). But no
question, we could use DC.

APH> RDF tools when they're available.  Finally, using a proprietary system
APH> when a perfectly good standard is available seems unwise to me.

Well, I'm not 100% sure, because our project and DC have slightly
different goals and needs. For example, using the file name as identifier
is never a good idea. And this is a real problem on the WWW at the moment:
it's only possible to link to a URL, not to a document (the way an ISBN
identifies a book).

APH> format.  Did you look at the included example or not?  For one, the
APH> tag set changed, although there's a one-to-one correspondence.  For

That is not true. You (or the DC team) have merged DocumentID and File,
and this creates new problems. I think the old solution was a lot better.

APH> Problem disappeared with removing formats as an *entity* at all.  All
APH> documents are first class entities.

Yes, that's a very good solution.

APH> So you keep talking about stuff you said a month ago that I've ignored
APH> but I can't find any such stuff?

My main problem is the position of the files and the URLs.

APH> I'll take this under advisement.  I've tried to organize the document
APH> such that stuff relevant for package maintainers is at the front.
APH> First is the intro, then the description of the elements, then the
APH> file format.  How could I organize it better?

Remove all SCHEME and LANG descriptions. Keep it short. It is really
difficult to read something like a description in the RFC 822 format.
These formats are nice for programmers, but not for maintainers.

APH> exercise judgement.  I just suspect you don't have any respect for
APH> following standard or learning from work done in archive management
APH> communities outside this small group.

I don't have a problem with other suggestions. But I think that there are
a lot of bad solutions. For example, some W3C standards (like CSS2) are
not really good.

APH> >  *) install-docs script: calls the auto converter, dhelp, dwww, ...
APH> Validates the file before calling anything.  Then will have a hook
APH> mechanism to invoke whatever systems are installed, i.e., dhelp.

It shouldn't validate the files. This reduces speed. We (your doc-base)
should offer a small Perl script that checks the files from debian/rules.

APH> >  *) Markus directory structure as .dogrec.dir
APH> AFAIK, the ddh is a file containing the DDH entries, not a directory
APH> structure, but I might be confused.  Marcus?

?

APH> > ? I can't see any problems with my idea, I'm using this method in dhelp
APH> > 0.3.x and there are no problems.
APH> Have some VISION for the future, man!

Yes, that's why I don't like your solution :).

APH> Do you *know* what a URN is?  Do you know why URLs are of limited
APH> value, and the ways that URNs can help?

No, I don't know. Could you explain it or post a URL?

APH> instance, we don't have debian/admin/faq, debian/admin/howto.

But we will have something like general/howto.

APH> > or selfhtml
APH> Never heard of it.  References?

www.debian.org: hamm/doc (it's written in German!)

APH> Functional arguments please.  What functionality is missing?  What do

DocID *and* file name.

APH> No, you read the file once and then build up your own database.  Why
APH> does it even matter where the file is since you ignore it once it's in
APH> the dhelp database?

See other mails.

APH> Also, what about the fact that dhelp cannot deal
APH> with the fact that if a file changes (i.e., a user using vi on the
APH> file) and the removal procedure is run, the entry is not in fact
APH> removed.  This seems like a design flaw in dhelp.

This is a design flaw in libdb :). But your /usr/lib/doc-base proposal
won't solve this problem.

APH> I felt that the "shadowing" of data, and the errors which creep in
APH> because of that, are the fundamental design flaw of the old doc-base.
APH> That's why I'm talking about a central document store.  If we could

I don't see the difference; please explain it.

APH> Basically what I'd love to see is what you've done with the database
APH> from dhelp, but made general and standard so every display system can
APH> use it.  What do you think?

Maybe nice, but a system like dhelp needs its own databases. I need a  
special sorting of the data.

APH> I also would like to see the redirection of identifiers to central
APH> locations as well.  Do you have ideas on how to do this?

Could you please explain that?

APH> > What does URN mean? Could you please explain this?
APH> Read materials at http://www.w3.org/Addressing/ .

Ok.

APH> The whole issue with URN is to find a way to address documents in a
APH> way that is not coupled to an individual host and/or file path.

That's a good idea.

APH> seems like bad system design because you are coupling the file system
APH> location with the resource identifier location.  Converting in and out

Yes, but this is not a problem of my solution itself. This problem is
introduced by merging docid and file. But with your solution we will have
the same problems.

If I (as package maintainer) were to move, for example, HOWTO/ to
HOWTO/html, all links to the HOWTOs would be broken with both solutions.

APH> of docreg format would be difficult.  Any storage system would need to
APH> track where it discovered the docreg file, for making absolute
APH> references out of relative ones.

Where's the problem?

APH> Suppose package foobar, which is only FSSTD compliant, installs a
APH> docreg file in /usr/doc/foobar/foobar.docreg.  Suppose that
APH> 'foobar.docreg' contains an entry for 'foobar.html' (relative, in your
APH> scheme).  As I understand it, in your central document store, you
APH> would have an object (row) with an identifier of
APH> '/usr/doc/foobar/foobar.html'.

Yes, that's right.
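
A minimal sketch of the resolution step discussed here, assuming only that
the store remembers where each docreg file was found. The function name
`resolve_entry` is hypothetical and not part of dhelp or doc-base; the
paths are the ones from the foobar example.

```python
import posixpath


def resolve_entry(docreg_path, entry):
    """Turn an entry path from a docreg file into an absolute path.

    Relative entries are resolved against the directory in which the
    docreg file was found; absolute entries are only normalized.
    """
    base_dir = posixpath.dirname(docreg_path)
    return posixpath.normpath(posixpath.join(base_dir, entry))


print(resolve_entry("/usr/doc/foobar/foobar.docreg", "foobar.html"))
# prints /usr/doc/foobar/foobar.html
```

The resolved path depends entirely on the docreg file's location, so the
same relative entry yields a different absolute path once the package moves
to /usr/share/doc.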

APH> Then suppose the next version of the
APH> pkg foobar is FHS compliant, but the maintainer forgets to run the
APH> removal process at the old location.

This would be a real bug! In this situation he could run "dhelp_parse -r"  
in the new postinst.

APH> Now, as I understand it, when
APH> the new pkg installs, we'll have another object (not replacing the
APH> original) with an identifier '/usr/share/doc/foobar/foobar.html'.

That's right, but the problem is not limited to different paths! If you
change one character in the whole entry, you will have two objects in the
database.

This is not a problem of the file format. It is caused by the limitations
of libdb(m). These databases allow only two fields per object (key, data).
To add all the needed data, I have to store several variables in one (with
a structure).

APH> How do we deal with this?  Reap objects without files?

Rebuild the database.

APH> The *real* issue is not how the *identifier* element is encoded, but
APH> What's the solution?

Don't use the file name as identifier. I have to read the DC page again. I
think that this solution is very bad (maybe I don't understand it?).

Books are a good example. You have a title (filename) and a number
(docid/identifier).
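
A small sketch of the book analogy, assuming a hypothetical `register`
helper and an invented docid `foobar-manual`. Neither comes from dhelp or
the proposal; the point is only that the identifier stays fixed while the
file name is ordinary data that may change.

```python
def register(store, docid, path, title):
    """Register a document under its docid; registering again replaces it."""
    store[docid] = {"file": path, "title": title}


by_docid = {}
# Old package version, FSSTD location.
register(by_docid, "foobar-manual", "/usr/doc/foobar/foobar.html", "Foobar Manual")
# New package version, FHS location, same docid.
register(by_docid, "foobar-manual", "/usr/share/doc/foobar/foobar.html", "Foobar Manual")

print(len(by_docid))                      # prints 1
print(by_docid["foobar-manual"]["file"])  # prints /usr/share/doc/foobar/foobar.html
```

With the docid as key, the second registration overwrites the first, so a
moved file does not leave a stale object behind, and links that refer to
the docid keep working.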

cu, Marco

--
Uni: Budde@tu-harburg.de           Fido: 2:240/5202.15
Mailbox: mbudde@hqsys.antar.com    http://www.tu-harburg.de/~semb2204/

