Re^16: Debian Metadata Proposal -- draft rev.1.4

On 03.08.98, apharris # burrito.onshore.com wrote ...

Hi Adam!

APH> Then you shouldn't call it identifier because that's not what
APH> 'Identifier' means.

Why not?

APH> we're doing here.  Can you explain exactly why you think we need a
APH> unique ID for metadata entities?

Again? One example: if you use a database to store the docreg information,
you have a key and a data element. And the keys have to be unique! If
they're not unique, this could cause real problems.

In dhelp 0.4.x I've got two databases:

    (key)                 (data)
  section, id  ->  lang, relation_id
  id           ->  title, descrip, file

And for both databases I need unique and pseudo-persistent IDs.
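
To illustrate why both tables depend on unique IDs, here is a minimal sketch using plain Python dictionaries. It is not the dhelp code: the function name `register`, the section name, and the sample values are all assumptions made up for the example.

```python
# Hypothetical sketch of the two tables described above, using plain dicts.
# Nothing here is taken from dhelp itself; names and values are illustrative.

by_section = {}   # (section, id) -> (lang, relation_id)
by_id = {}        # id -> (title, descrip, file)


def register(section, doc_id, lang, relation_id, title, descrip, file):
    """Add one document to both tables; refuse an ID that is already used."""
    if doc_id in by_id:
        # A second entry with the same key would silently overwrite the
        # first one, so the duplicate is rejected instead.
        raise KeyError(f"duplicate ID: {doc_id!r}")
    by_id[doc_id] = (title, descrip, file)
    by_section[(section, doc_id)] = (lang, relation_id)


register("howto", "abc/abc.html", "fr", None,
         "ABC", "sample description", "mini/abc.html")

try:
    # Same ID again: rejected, the first entry stays intact.
    register("howto", "abc/abc.html", "de", None,
             "ABC", "sample description", "mini/abc-de.html")
except KeyError as err:
    print("rejected:", err)
```

Without the check, the second call would replace the first entry in `by_id`, which is the kind of problem non-unique keys cause.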

APH> > Right and that is bad. I'm working on a translation document, I have to
APH> > ask the maintainer of the original document to release a new version
APH> > with this debian-identifier.
APH> ??  No you can just refer to the pkg/file where you got it from!  If
APH> there's no "Debian-Identifier" then, clearly, you can't use it anyway.

That's it! That's why we have to force the maintainers to use a unique ID.

APH> No, but identifiers point to files, not to metadata.  You don't seem

Maybe, but is that important? I don't think so.

APH> How do you intend it to be unique?  How can you enforce that?  Use the
APH> package name?

The package name would be the standard solution. For important things like  
the HOWTOs we could use "HOWTO-<lang>/<doc name>". Or we could give every  
maintainer his own numbers (like the ISBN system).

APH> But what if the package changes its name?  Or what if
APH> the doc is split out into another package?

OK, this could be a problem. But we could solve it.

APH> See, the whole problem with your proposal, and I've said this again
APH> and again, is that you are trying to stuff two kinds of entities into
APH> docreg: persistent names for resources (lets call these m-ids, just
APH> for clarity), and the metadata for resources.  By coupling these
APH> together into docreg files, you are inviting serious trouble.

I don't think so. You would have the same problems if you used URNs.
There are no real problems with my proposal.

My proposal adds only one necessary tag, to identify the file. DC solved
this problem by adding the DC information to the document itself. So I
don't see a big difference between my proposal and the DC proposal.

Using an identifier as a filename is a really bad idea. I don't understand
why you like this idea. The old doc-base has an ID and a file name.

APH>  * remove m-ids once the are created
APH>  * rename m-ids

Why not? OK, you should avoid it if there are translated documents. But
this is not a problem, because all documents are maintained by Debian.

And your proposal has the same problems.

APH>  * refer to m-ids in packages that are not installed, i.e., on a Debian
APH>    Documentation mirror (relevant to the Relation.* fields)

??? I don't understand that. If you install only the translated document,
a system like dhelp shows it as the original. Where's the problem, and
where are the differences?

APH>  * enforce uniqueness on m-ids, and lack of enforced uniqueness is a bug
APH>    in your scheme

That's right, and this is one of the advantages. And again, your proposal
requires unique IDs, too! If two docreg files add the same URL, you have
problems with the relations.

APH>  * associate metadata with non-local URLs (i.e., http://www.debian.org/)


APH> Given all these weakness, I just feel that your scheme really isn't
APH> much better than just referring to a URL.

OK, maybe you don't need the additional features, but I need them. And
there are no additional problems introduced by my solution. So why
shouldn't we use it?

APH> Do you see my fundamental point?

I see it, and this is the problem with your proposal. You're talking about
abstract definitions of words like ID and metadata. And I'm trying to
define a small and simple file format for our needs. I think it's not
important if some other people have another definition of metadata
or IDs.

We're talking about a solution for Debian. We're *not* talking about
solutions for libraries, books, or the WWW.

For example, I don't think that the DC standard itself is a really good
design. There are several things that should be improved. But of course we
could use the DC ideas. But why shouldn't we add additional information?

APH> docreg files manage *metadata*.

No. That is your definition. We're talking about a file format to add
documents (files, descriptions) to a database.

APH> Metadata is information about a resource.  You are proposing to also
APH> manage these things called 'm-ids' in the docreg files.  'm-ids' and

I'm proposing to split the ID. What's the problem with storing the filename
in a second line?

APH> You are proposing to require that ever metadata entity in a docref
APH> file contains not only metadata, but also manages these entities
APH> called 'm-ids'.  I feel that m-id management should not be tied to the
APH> actual metadata, and that it is orthogonal.

I don't understand that. I'm proposing something like this:

 * every book has a unique ID (called the ISBN)
 * and it has a local number that tells the user where he
   can find the book (the filename in my proposal)

Most libraries use something like my proposal for their books.

APH> When I talk about URNs,
APH> I'm basically saying the same thing as you, but pointing out that m-id
APH> management is it's own beast; as such, it should be globally tracked
APH> and stored and managed on it's own.  Furthermore, my scheme lets you

1.) I'm not talking about real URNs. I'm talking about local unique IDs.

2.) A second mapping system is useless.

APH> intermix normal 'file:' references, 'http:', 'news:', 'mailto:',
APH> whatever.  So it's working with existing standards, whereas you are
APH> blowing them away for no good reason.

I don't understand that. Why couldn't you use http: etc. with my solution?
Where's the difference? You define the URL in the Identifier field and I'm
using the File: field. So there's no difference.

APH> > But with your solution you've to maintain it! If two packages add the
APH> > same URL (for example to www.debian.org) you have got a problem!
APH> Huh?
APH> I don't know,

Both entries would have the same identifier using your proposal. How
should I add them to my database? How should I manage relations?

APH> I think I would like to see your scheme presented in

OK, maybe I'll post a short description.

APH>  * managing m-ids

There are several solutions.

APH>  * the "Relation.*" fields

No differences.

Identifier: abc/abc.html
URL: mini/abc.html
Language: fr

Identifier: abc-de/abc.html
URL: mini/abc.html
Relation.IsBasedOn: abc/abc.html
Language: de

Identifier: abc-en/abc-en.html
URL: mini/abc-en.htm
Relation.IsBasedOn: abc/abc.html
Language: en

[ The URL is always relative to the docreg file! ]
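
As a rough illustration of how entries like these could be read and how the Relation.IsBasedOn field could be followed, here is a minimal Python sketch. It assumes entries are separated by blank lines and that every line has the form "Field: value"; the function names `parse_docreg` and `translations_of` are invented for this example and are not part of dhelp or doc-base.

```python
# Hypothetical sketch: parse docreg-style entries and follow IsBasedOn.
# The format assumptions (blank-line separated stanzas, "Field: value"
# lines) and all function names are illustrative only.

DOCREG = """\
Identifier: abc/abc.html
URL: mini/abc.html
Language: fr

Identifier: abc-de/abc.html
URL: mini/abc.html
Relation.IsBasedOn: abc/abc.html
Language: de

Identifier: abc-en/abc-en.html
URL: mini/abc-en.htm
Relation.IsBasedOn: abc/abc.html
Language: en
"""


def parse_docreg(text):
    """Return a dict mapping each Identifier to its fields."""
    entries = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Split on the first colon only, so values may contain colons.
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        ident = fields["Identifier"]
        if ident in entries:
            raise ValueError(f"duplicate Identifier: {ident!r}")
        entries[ident] = fields
    return entries


def translations_of(entries, original_id):
    """List the languages of all entries based on the given original."""
    return sorted(
        fields["Language"]
        for fields in entries.values()
        if fields.get("Relation.IsBasedOn") == original_id
    )


entries = parse_docreg(DOCREG)
print(translations_of(entries, "abc/abc.html"))
```

Because the relation refers to the Identifier and not to the URL field, the lookup keeps working even when two entries carry the same URL.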

APH>  * intermixing local files and WWW pages

Identifier: foo/foo page
URL: http://www.tu-harburg.de/
Language: en

Identifier: foo/foo Seite
URL: foo-de.html
Language: de
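
One possible way to treat both kinds of URL field uniformly is to resolve each value against the location of the docreg file, so that relative names become local files and absolute URLs pass through unchanged. The sketch below uses `urljoin` from Python's standard library; the docreg location and the function name `resolve` are assumptions for the example only.

```python
# Hypothetical sketch: resolve the URL field relative to the docreg file.
# The docreg location below is invented for illustration.
from urllib.parse import urljoin

DOCREG_LOCATION = "file:///usr/doc/foo/docreg"  # assumed location


def resolve(docreg_location, url_field):
    """Relative values resolve next to the docreg file; absolute URLs
    (for example http:) are returned as they are."""
    return urljoin(docreg_location, url_field)


print(resolve(DOCREG_LOCATION, "foo-de.html"))
print(resolve(DOCREG_LOCATION, "http://www.tu-harburg.de/"))
```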

APH>  * allowing any given URL (i.e., 'news:comp.os.linux.announce'; would
APH>    you put this in the horribly named 'Files' field?!)

Yes. As I've proposed, we could rename it. Maybe URL:

APH> Maybe I just don't understand your proposal.

cu, Marco

P.S.: If you're interested, I could release an experimental version of
      dhelp 0.4.x. It's still not bug-free, but it's working.

Uni: Budde@tu-harburg.de           Fido: 2:240/5202.15
Mailbox: mbudde@hqsys.antar.com    http://www.tu-harburg.de/~semb2204/
