Bug#99933: Bug#99324: Default charset should be UTF-8
> > I disagree. The Han Unification issue is more like the difference
> > between the latin and the italic character sets. Yes, many characters
On Mon, Jun 11, 2001 at 07:20:21PM +0200, Radovan Garabik wrote:
> No, because latin (upright) and italics are used interchangeably,
> whereas fraktur carries implicit connotation of language used -
> just like different glyphs for unified CJK charset.
I'm sorry, I meant Old Italic (U+10300-U+1032F), not italics.
This includes letters like U+10308 OLD ITALIC LETTER THE (a circle
with an X in it) as well as letters like U+10301 OLD ITALIC LETTER BE
(essentially the same as a capital roman B).
Here, we could assume a common history, and define a map which relates
many of the characters, much as has been done with Han Unification.
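
For anyone who wants to check those names, here is a minimal Python
sketch. It assumes Python 3 and the Unicode database bundled with its
standard unicodedata module; the helper name describe() is made up for
illustration and is not part of any library.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point, using Python's bundled
    Unicode character database (hypothetical helper, for illustration)."""
    ch = chr(code_point)
    # Fall back to a placeholder if the database has no name for it.
    return f"U+{code_point:04X} {unicodedata.name(ch, '<unnamed>')}"


# Two Old Italic letters, plus the Latin capital B for comparison.
for cp in (0x10308, 0x10301, 0x0042):
    print(describe(cp))
```

The Old Italic letters and the Latin B are separate code points with
separate names, even where the shapes are nearly identical.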
> > are similar, however there are also some characters which are unique
> > to each representation.
> > Also, Unicode does include Fraktur characters.
> but in mathematical symbols - that is a completely different beast
Please explain why it matters to the reader whether the letter A is
classified by the Unicode Consortium as mathematical [or not]?
> > > I am really not sure if unicode went the right way, I feel the
> > > ability to display Chinese name in a Japanese document using
> > > Chinese glyphs (or vice versa) is something that should not be
> > > got rid of...
> > And, this could be rectified -- with Unicode 3.1, they have the code
> > space to represent each major representation of the character set.
> if only they instead of talking how bad is unicode started working on
> improving it (duck, run :-))
I have neither the technical skill nor the political connections to properly
contribute to the unicode consortium. I can, however, point out major
problem areas, and I like to think of that as valuable [at least to
Debian -- I like to think that the members of the Unicode Consortium
are already aware of these problems].
> > > perhaps it should consider them to be different scripts with
> > > different encodings, but when would it stop? Making italics,
> > > boldface etc. to be different characters?
> > Unicode already does that. Take a look at the mathematical
> > alphanumeric symbols [1D400-1D744]. For example: 1D400 MATHEMATICAL
> > BOLD CAPITAL A
> the reason and purpose of these characters is quite different from
> "base" unicode characters
The point is that Unicode already supports the things you were
suggesting as more unreasonable than indicating oriental language.
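
As a rough illustration of that point, the following Python sketch
shows that the bold mathematical A is encoded separately from the plain
Latin A, and that Unicode's compatibility normalization (NFKC) relates
the two. It assumes Python 3 and its standard unicodedata module; the
variable names are chosen only for this example.

```python
import unicodedata

# U+1D400 is the bold capital A from the mathematical alphanumeric symbols.
bold_a = "\U0001D400"
plain_a = "A"

# The two are distinct code points with distinct names...
print(unicodedata.name(bold_a))
print(unicodedata.name(plain_a))
print(bold_a == plain_a)

# ...but NFKC compatibility normalization folds the styled letter
# back to the plain one.
folded = unicodedata.normalize("NFKC", bold_a)
print(folded == plain_a)
```

So the style distinction lives in the code point itself, while the link
back to the base letter is recorded as a compatibility mapping.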
> > > As for X11, fonts are being rapidly developed.
> > For currently relevant policy it matters what actually works.
> of course. That's why my proposal is very mildly worded and gives a
> lot of freedom to maintainers to decide what charset they want.
> > > > "Package may (at the discretion of the maintainer) include
> > > > documentation files in other encodings, if they are present also in
> > > > canonical encoding, and if the encodings used are clearly marked.
> > > > If a particular font is required, that should be clearly marked."
> > >
> > > You do not know what is a particular font... one of
> > > (traditional|simplified)C,J,K, or the full font name?
> > I'm not sure I understand this question (I don't know enough about
> > oriental languages and fonts to give a full answer in any event).
> well, would you indicate just "this README needs japanese unicode font"
> and the user has to figure out by himself what is that
> or "this README needs -misc-fixed-*-*-*-ja-*-*-*-*-*-*-iso10646-1"
> and the user is fubar when he does not have that font.
I think "needs japanese unicode font" might suffice. Perhaps a package
name which includes that font would also be good. An X font spec would,
of course, be necessary if you wanted a program to "just work".
It depends on context.
> > > More appropriate example from the history is the war between
> > > EBCDIC, ASCII and other proprietary encodings... thank god one and
> > > only one encoding won.
> > EBCDIC vs. ASCII wasn't about supported languages.
> true, but the mess in encodings was quite comparable to what is there
> today outside of Latin-1 world. And the peace ASCII brought could be
> compared to peace that (hopefully :-)) unicode brings one day.
I'll accept your analogy. (In the name of peace :).