Re: A small question
>That's why I repeat: if we have ISOLat1 characters to output, these should be
>encoded as 2-byte sequences in case of UTF-8. Thus, the output files we have
>at the moment <emphasis>cannot</emphasis> be interpreted as UTF-8, since they
>are not.
Hmm. Ok, this might be a problem. I don't know. It's up to the
application developer (Ardo) to determine what he wants to do.
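
To make the quoted point concrete, here is a minimal Python sketch (not part of the debiandoc toolchain; the helper name `is_valid_utf8` is made up for illustration). It shows that the copyright sign is one byte in ISO Latin-1 but a 2-byte sequence in UTF-8, and that the raw Latin-1 byte is rejected by a UTF-8 decoder.

```python
# U+00A9 COPYRIGHT SIGN, written as an escape so the source file's own
# encoding does not matter.
COPYRIGHT = "\u00a9"

latin1_bytes = COPYRIGHT.encode("latin-1")  # a single byte, 0xA9
utf8_bytes = COPYRIGHT.encode("utf-8")      # two bytes, 0xC2 0xA9


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(latin1_bytes, is_valid_utf8(latin1_bytes))
    print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

A lone 0xA9 is a UTF-8 continuation byte with no lead byte before it, which is why output containing raw Latin-1 characters cannot be read as UTF-8.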
>You see, the construct \|...\| can be easily caught since it's a special thing
>(`\' in input will be escaped with \ giving \\ in output). Well, in case of
>SDATA-entities, I see how to make use of them.
I don't see why \|...\| is any easier to catch than ©. They are both unique!
Furthermore, if we can get the charset of the debiandoc char stream
sorted out, you can hook up *standard*, already written tools to go
from one char set to another.
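
As a rough illustration of what such a conversion step does, here is a sketch using only Python's built-in codecs. The function name `recode` and its default charsets are assumptions for the example, not anything debiandoc provides; it only works once the charset of the input stream is actually known.

```python
def recode(data: bytes, source: str = "iso-8859-1", target: str = "utf-8") -> bytes:
    """Convert bytes from one character set to another (illustrative helper).

    Decoding with the *known* source charset is the step that requires the
    charset question to be sorted out first; encoding to the target is then
    mechanical.
    """
    return data.decode(source).encode(target)


if __name__ == "__main__":
    # 0xA9 is the copyright sign in ISO Latin-1.
    print(recode(b"\xa9 1999"))
```

The same function run in the other direction (`source="utf-8"`, `target="iso-8859-1"`) goes back again, as long as every character exists in the target charset.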
>> >One more issue (I just took a more thorough look at the entities supplied by
>> >sgml-data): why do some files provide Unicode equivalents for entities and some
>> >proprietary SDATA? Is this by design?
>>
>> There are none that use SDATA AFAIK. You might be mixing up sgml-data
>> with some other packages which put stuff in /usr/lib/sgml/entities.
>
>I am sorry to say that the sgml-data package, freshly downloaded and unpacked
>in a separate directory, has ISO* files that define SDATA entities.
Yes indeed. This inconsistency seems to be a bug.
>Well, and now returning to `stock' SGML entities. copy, and certain other
>entities (like nbsp, for example) are from ISOnum, while in sgml-data package
>they are defined in both of them (and they are different, BTW).
Some overlap may be ok. ISO defines it -- not Debian!
>As for working out this problem. There are two possibilities: to make use of
>SDATA entities in all programs that come with Debian; or to use some Unicode
>encoding for intermediate/output files.
I opt for Unicode. Unless there is a standard that the copyright
circle 'c' glyph needs to be '[copy ]' and not '[copy ]' nor
'[COPY ]', that is, unless I am given guidelines by which to
distinguish the proper notation from the impostor, I am very hesitant
to do that.
I would like someone to tell me what should be done, using the
standards out there to back up their arguments. I am willing to
provide SDATA encodings but not as the *default* unless they are
defined by some standard and it doesn't break the fundamental
jade/dsssl toolchain.
--
.....Adam Di Carlo....adam@onShore.com.....<URL:http://www.onShore.com/>