Re: charsets in debian/control



Steve Langasek writes,

> ... most of the letters you listed here are specific
> to the IPA, which would have no use at all in a
> control file as they're not part of the writing system
> of any natural language.

Ok.

> Encodings and charsets are distinct concepts.  Just
> because the file is specified in UTF-8 *encoding* does
> not mean we suddenly have to start coping with the
> entire Unicode character set.

Right.
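
To make the encoding-versus-charset distinction concrete,
here is a minimal Python sketch. It decodes a field as
UTF-8 (the encoding) and separately checks each character
against a restricted repertoire (the charset). The names
ALLOWED_RANGES, in_repertoire and check_field, and the
ranges themselves, are assumptions made up for this
illustration; nothing in Debian policy specifies them.

```python
# Hypothetical repertoire: these ranges are illustrative only,
# not something any Debian document specifies.
ALLOWED_RANGES = [
    (0x0020, 0x007E),  # printable ASCII
    (0x00A0, 0x024F),  # Latin-1 Supplement through Latin Extended-B
    (0x0370, 0x03FF),  # Greek
    (0x0400, 0x04FF),  # Cyrillic
]


def in_repertoire(ch):
    """Return True if the character falls inside an allowed range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ALLOWED_RANGES)


def check_field(raw):
    """Decode a UTF-8 field and list characters outside the repertoire."""
    # The encoding step: invalid bytes raise UnicodeDecodeError.
    text = raw.decode("utf-8")
    # The charset step: valid UTF-8 may still carry unwanted characters.
    rejected = [ch for ch in text if not in_repertoire(ch)]
    return text, rejected
```

Under these assumed ranges, a name such as "J\u00f6rg" passes,
while U+0283 (an IPA letter) decodes without error but is
reported as outside the repertoire.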

> Why, what a lovely straw man you have there.

No comment.

> But yes, non-ASCII Latin-1 chars should not be given
> special status over the national chars found in other
> languages spoken by project members.  Debian should be
> using either ASCII, or Unicode; standardizing on
> Latin-1 makes no sense in a global project.

True.  Look, Steve: mild abuse aside, I agree with you
in every particular.  Nevertheless, I would respectfully
suggest that your criticism underscores my point, which
concerns the monstrous increase in complexity that the
full Unicode standard represents.  Consider.  Is it a
bug if Readline cannot echo full bidirectional input?
If Dselect does not appreciate all the non-spacing
characters?  If Less does not regard Tibetan subjoined
letters?  (This is my Tibetan straw man.)

Undoubtedly one might observe that the Tibetan problem
is not really a problem with Less but rather with some
underlying library, but this misses the point---or
rather again it underscores the point.  Unicode solves
what for many of us was not a problem by creating an
entirely new class of problems.  For example, it
requires us to be particular about how we tag our e-mail
attachments...

> ... to properly declare the character set on the
> non-ASCII mails you send.

We can perhaps be pardoned for feeling a little grumpy
about this.
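
For reference, declaring the character set of a non-ASCII
mail can be sketched with Python's standard email package.
This is only an illustration of the tagging in question:
build_message is a hypothetical helper, and the addresses
are placeholders.

```python
from email.message import EmailMessage


def build_message(body):
    """Build a plain-text mail whose charset is declared as UTF-8."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"   # placeholder address
    msg["To"] = "list@example.org"       # placeholder address
    msg["Subject"] = "charsets in debian/control"
    # set_content records the charset in the Content-Type header,
    # so the recipient's mail reader knows how to decode the body.
    msg.set_content(body, charset="utf-8")
    return msg
```

Calling build_message on a body with non-ASCII text yields
a message whose get_content_charset() reports "utf-8".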

Am I arguing to jettison Unicode?  No; to the partial
extent that I had been arguing it earlier in the thread,
you, Peter, Daniel and Matthew have changed my mind.
However, the typical roster of skills one masters in
contributing broadly to Debian development is already
awesome: C, C++, CPP, Make, Perl, Python, Autoconf, CVS,
Shell, Glibc, System calls, /proc, IPC, sockets, Sed,
Awk, Vi, Emacs, locales, Libdb, GnuPG, Readline,
Ncurses, TeX, Postscript, Groff, XML, assembly, Flex,
Bison, ORB, Lisp, Dpkg, PAM, Xlibs, Tk, GTK, SysVInit,
Debconf, ELF, etc.---not to mention the use of the
English language at a sophisticated technical level.
UTF-8 is neat, but I do not really like Unicode (you may
have noticed this).  Seeking essential simplicity, I
would prefer to keep the full hairy overgrown Unicode
standard off the typical Debian roster of development
skills.  Wouldn't you?

-- 
Thaddeus H. Black
508 Nellie's Cave Road
Blacksburg, Virginia 24060, USA
+1 540 961 0920, t@b-tk.org
