Bug#99933: Comments on Unicode
On Fri, Jul 06, 2001 at 04:36:25AM +0100, David Starner wrote:
> > Do you have any idea whether the problems identified at
> > http://support.microsoft.com/support/kb/articles/Q170/5/59.ASP
> > have been resolved?
>
> Are they a problem for us? Windows Code Page 932 may or may not correspond
> to anything that we care about. (At a glance, in each pair of codes that
> maps to the same Unicode character, at least one is not in the real JIS X
> 0208.)
If this is indeed a CP 932 problem and not a Shift JIS problem, and if
we indeed don't support CP 932, then I'll agree that this isn't a
problem for us.
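
For anyone who wants to see the shape of the problem, the duplicate mappings can be listed with a short script. This is only a rough sketch: it assumes Python's built-in "cp932" codec is a fair stand-in for the table the Microsoft article discusses (I have not checked it against the article), and the function name cp932_duplicates is made up for illustration.

```python
from collections import defaultdict

def cp932_duplicates():
    """Return {character: [byte sequences]} for every character that more
    than one double-byte CP932 sequence decodes to (per Python's codec)."""
    sources = defaultdict(list)
    # Double-byte lead bytes; 0xA1-0xDF are single-byte half-width katakana.
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    for lead in lead_bytes:
        for trail in range(0x40, 0xFD):
            raw = bytes([lead, trail])
            try:
                text = raw.decode("cp932")
            except UnicodeDecodeError:
                continue  # not assigned in this codec
            if len(text) == 1:
                sources[text].append(raw)
    return {ch: seqs for ch, seqs in sources.items() if len(seqs) > 1}

duplicates = cp932_duplicates()

# A byte sequence is "lossy" if decoding and re-encoding gives other bytes.
lossy = [
    (ch, raw)
    for ch, seqs in duplicates.items()
    for raw in seqs
    if ch.encode("cp932", errors="replace") != raw
]

print(len(duplicates), "characters have more than one CP932 source")
print(len(lossy), "byte sequences do not survive a round trip")
```

Running the same loop against a plain "shift_jis" codec would be one way to check whether the duplicates are specific to CP 932.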
> > Prior to Unicode 3.1 the code space was 16 bits.
>
> NO. Since Unicode 2.0, the code space has been 21 bits. The ONLY thing
> that Unicode 3.1 did was put characters above U+FFFF. It did not
> change the fundamental structure of Unicode in the least.
I stand corrected.
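
A small sketch of what the 21-bit code space looks like in practice, using only Python built-ins. The assumption here is that the surrogate-pair mechanism of UTF-16 is what makes code points above U+FFFF reachable; the variable names are illustrative only.

```python
# The highest Unicode code point, U+10FFFF, needs 21 bits.
MAX_CODE_POINT = 0x10FFFF
bits_needed = MAX_CODE_POINT.bit_length()
print(bits_needed)

# The first code point above U+FFFF does not fit in one 16-bit unit,
# so UTF-16 represents it as a pair of surrogate code units.
first_supplementary = chr(0x10000)
utf16 = first_supplementary.encode("utf-16-be")
code_units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
print([hex(unit) for unit in code_units])

# Anything past U+10FFFF is outside the code space altogether.
try:
    chr(MAX_CODE_POINT + 1)
    beyond_is_rejected = False
except ValueError:
    beyond_is_rejected = True
print(beyond_is_rejected)
```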
> > Once Unicode can act as a superset for every character set we currently
> > support, we can use it as such. Until then, we can't.
>
> If Unicode were a superset for every character set that anyone needs to
> support, it would be worthless and completely unusable.
I didn't say for any character set that anyone needs to support.
I said for every character set we currently support. I hope you see the
difference. [And, as an aside, I should have said "for each character
set that we currently support" -- I understand that Unicode doesn't need
to support mixed character set usage before we migrate.]
> However, if we currently support any character set well, it is through
> a Unicode-based glibc - I don't believe libc accepts the existence of
> any character set that can't be mapped to Unicode. So arguably, yes,
> Unicode is a superset for every character set we currently support
> well.
Assuming we're using glibc support (e.g. toupper()) for all those
character sets, I'll agree that you have a good point.
On 20010705T133736-0400, Raul Miller wrote:
> > In HTML the language can only be identified in the MIME header.
On Fri, Jul 06, 2001 at 11:23:42AM +0300, Antti-Juhani Kaijanaho wrote:
> There is no such thing as a MIME header in HTML.
>
> Besides, HTML does include the lang attribute for most elements. I
> wonder what it's for if not for indicating the language.
I stand corrected.
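
For the record, a minimal sketch of reading the lang attribute out of a document, using Python's standard html.parser module. The sample markup and the LangCollector class are made up for illustration.

```python
from html.parser import HTMLParser

class LangCollector(HTMLParser):
    """Collect (tag, language) pairs for every element with a lang attribute."""

    def __init__(self):
        super().__init__()
        self.langs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        for name, value in attrs:
            if name == "lang":
                self.langs.append((tag, value))

# Hypothetical document: English overall, with one Finnish paragraph.
sample = '<html lang="en"><body><p>Hello</p><p lang="fi">Hei</p></body></html>'

collector = LangCollector()
collector.feed(sample)
print(collector.langs)
```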
Thanks,
--
Raul