
Bug#1026231: debian-policy: document droppage of support for legacy locales



On Fri, Jan 20, 2023 at 7:42 AM Bill Allombert <ballombe@debian.org> wrote:
>
> On Thu, Jan 19, 2023 at 11:47:42AM +0000, Simon McVittie wrote:
> > On Wed, 18 Jan 2023 at 16:30:46 -0700, Anthony Fok wrote:
> > > In their mind, GB 18030 encompasses a lot more than just
> > > a character encoding mapping table.  It is the full support package
> > > (including fonts, display, printing, input methods, etc.) for Han
> > > Chinese and all other minority languages used in China.
> >
> > Preferring to use Unicode does seem to be the direction that all of
> > computing is going in, as a simplifying assumption - for example W3C
> > advice for HTML is "You should always use the UTF-8 character encoding"[1]
> > - and as we know, things that aren't tested usually don't work. So I
> > think the level of functionality for non-UTF-8 locales and encodings in
> > the software we package is going to decline over time, whether Debian
> > wants it to or not.

Re-reading Simon's comment: Yes, UTF-8 is the ideal, but
supposedly some older Chinese websites are still using "GBK" as their
encoding, probably with something like:

     <meta http-equiv="Content-Type" content="text/html;charset=gbk">

which has fewer than 30,000 characters and thus covers only a very
limited subset of Unicode.  And, presumably because not everyone has
the know-how to convert to UTF-8, the Chinese government wants those
unable to do so to at least change that meta tag to:

     <meta http-equiv="Content-Type" content="text/html;charset=gb18030">

where GB18030, being a Unicode Transformation Format, albeit a
somewhat awkward one, would be able to represent any character in
Unicode.
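
As a rough illustration of that coverage difference, here is a small
sketch using Python's built-in "gbk" and "gb18030" codecs.  The helper
name and the sample strings are made up for this example, and the
behaviour shown is that of Python's codecs, not of any particular
browser or web server:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample strings, chosen only for illustration.
samples = {
    "common hanzi": "\u4e2d\u6587",            # "Chinese (language)"
    "Tibetan": "\u0f56\u0f7c\u0f51",           # a minority script used in China
    "emoji outside the BMP": "\U0001F600",
}

for label, text in samples.items():
    # Print whether each sample fits in GBK and in GB18030.
    print(label, can_encode(text, "gbk"), can_encode(text, "gb18030"))
```

With these samples, only the common hanzi should fit in GBK, while all
three should encode (and round-trip) under GB18030.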

> It is true for everything. Users know how to pick the software that works for their
> environment. It is not relevant that software they do not use do not support their
> environment.
>
> Telling users to switch to UTF-8 because such and such software they never used
> and were never going to use do not support GB18030 does not make sense.

I have the feeling that many tech-savvy Chinese have already switched
to UTF-8, but then perhaps in some circles there are lots of legacy
GB2312/GBK documents or systems that make GB18030 a necessity, as an
intermediate step to Unicode.

(Not so in Taiwan and Hong Kong, where they jump straight to UTF-8
from Big5 or Big5-HKSCS.  For better or for worse.)
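
For the legacy GB2312/GBK documents mentioned above, the "intermediate
step" idea can be sketched as follows, again with Python's built-in
codecs.  The function name and sample text are hypothetical, and the
sketch assumes the input bytes are ordinary GBK text that GB18030
decodes identically, which should hold for common hanzi but is worth
checking on real data:

```python
def legacy_gbk_to_utf8(raw: bytes) -> bytes:
    """Decode legacy GBK bytes via the GB18030 codec and re-encode as UTF-8.

    Assumption: the input is plain GBK text, so decoding it as GB18030
    yields the same characters.
    """
    return raw.decode("gb18030").encode("utf-8")


# Hypothetical legacy document content ("Simplified Chinese").
original_text = "\u7b80\u4f53\u4e2d\u6587"
legacy_bytes = original_text.encode("gbk")

# The same bytes are readable as GB18030 without any conversion...
print(legacy_bytes.decode("gb18030") == original_text)

# ...and can be converted to UTF-8 whenever the site is ready.
print(legacy_gbk_to_utf8(legacy_bytes).decode("utf-8") == original_text)
```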

> It is like saying the Linux console is deprecated because there are Debian
> packages that requires X or Wayland.
>
> Cheers,
> --
> Bill. <ballombe@debian.org>
>
> Imagine a large red swirl here.

Thanks for the wonderful discussion, Bill!

Cheers,
Anthony

