Re: A small question

To: Michael Sobolev <mss@transas.com>
Cc: debian-sgml@lists.debian.org
Subject: Re: A small question
From: Adam Di Carlo <adam@onshore.com>
Date: 01 Jul 1999 13:43:39 -0400
Message-id: <oapv2c8dv8.fsf@burrito.fake>
In-reply-to: Michael Sobolev's message of "Tue, 29 Jun 1999 23:41:09 +0400"
References: <19990629234108.A20329@transas.com>

Michael Sobolev <mss@transas.com> writes:

> I've got a small question: where do all these entities come from? :)

W3O, mostly.  See the copyright file in the sgml-data package.

> These make me think that it does not matter whether the //HTML suffix is
> there: the entities are the same.
[...]
> Aha, at least, this makes me think that these two files are different!  They
> define different sets of entities.  BUT, according to the
> /usr/lib/sgml/catalog file, the first set of entities can also be referred to
> as "...//EN".

> So here is my question: how should I treat all this information?

With caution.  It is possible that I have screwed up and marked as
non-HTML-specific what really *is* HTML-specific.  Note that the
docbook-xml package contains XML versions of this stuff (XML encodes
entities a little differently; I think it's implicitly CDATA).

> My main concern (well, it's where this investigation started from) is the
> entity named copy.  If I look into the first file I see
> 
>     <!ENTITY copy CDATA "&#169;">

This is a Unicode character definition.
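
For reference, 169 decimal is code point U+00A9 (COPYRIGHT SIGN).  A
quick way to confirm what that character reference resolves to, as a
small Python sketch (Python is only used here for illustration; it is
not part of the sgml-data package or the SGML tool-chain):

```python
import html

# "&#169;" is a numeric character reference to code point 169 (U+00A9).
copyright_sign = html.unescape("&#169;")

# Show the resolved character and its code point in hex.
print(copyright_sign, hex(ord(copyright_sign)))
```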

> I see no definition for copy in the second file, while iso-.../ISOnum file
> defines:
> 
>     <!ENTITY copy   SDATA "[copy  ]"--=copyright sign-->

From <URL:http://www.oasis-open.org/cover/isoEntsExplained.html>,

| They are "SDATA" entity sets, which means that it is the job of the
| recipient to map them to something locally useful.

> These are different definitions and while in the second case I could process
> this SDATA [copy  ] for producing &copy; in HTML output and \copyright in TeX
> output, I lack this possibility in first case.

Why do you say that?  As far as I am aware there are TeX packages that can handle Unicode.

> Please comment.

Well, basically, the SDATA mappings are entirely arbitrary.
Therefore, for the standard entity sets which I have shipped with the
sgml-data package, I use the Unicode entity mappings, which are handled
fine by advanced browsers and the SGML tool-chain (nsgmls, jade, etc).
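
To make "arbitrary" concrete: with SDATA entities, every output backend
has to carry its own lookup table from the SDATA replacement text to
whatever it wants to emit.  A minimal sketch in Python, assuming
hypothetical table and function names (SDATA_TO_HTML, SDATA_TO_TEX,
render_sdata) that do not come from any real tool; only the "[copy  ]"
string and the two target spellings are taken from the thread:

```python
# Hypothetical per-backend tables; a real backend would need one entry
# for every SDATA entity it wants to support.
SDATA_TO_HTML = {"[copy  ]": "&copy;"}
SDATA_TO_TEX = {"[copy  ]": r"\copyright"}


def render_sdata(text, table):
    """Return the backend-specific output for an SDATA replacement text.

    Raises KeyError for an unmapped entity, so a gap in the table is
    noticed instead of being passed through silently.
    """
    return table[text]


print(render_sdata("[copy  ]", SDATA_TO_HTML))
print(render_sdata("[copy  ]", SDATA_TO_TEX))
```

Nothing in the entity declaration itself says what "[copy  ]" means;
the meaning lives entirely in tables like these, which is why each
recipient can end up with a different answer.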

I definitely am willing to ship alternate SDATA-style entity sets
for SGML (XML requires the Unicode ones).  I suppose either I could
use a different FPI for that, or else I could even use SGML "marked
sections" and a conditional parameter (i.e., use 'nsgmls
-iuse-sdata-entities ...') to switch between whatever representation
of entities you might want.  In either case, the default, IMHO, should
be the Unicode representation.

I *guess* I prefer the former option (use alternate FPIs) because it
seems like we could do it a bit at a time....

For more info read
<URL:http://www.oasis-open.org/cover/topics.html#entities>.

--
.....Adam Di Carlo....adam@onShore.com.....<URL:http://www.onShore.com/>
