
Re: [Kind of OT] Why's this look like gibberish to me?



On Tue, May 06, 2008 at 05:47:36PM -0500, Kevin Buhr wrote:
> "Douglas A. Tutty" <dtutty@porchlight.ca> writes:
> >
> > What gets me is when a man page is written in english and "'" gets
> > translated as "?", as in can?t or "'" is a square white blob (on a
> > regular VT).  Why couldn't whoever wrote it in english have used the
> > standard english "'" glyph instead of a UTF thingy?
> 
> The problem isn't the manpage author, it's your setup.
> 
> Specifically, you're using a locale that sports UTF-8 encoding, but
Wrong.  LANG=C.  I don't have any locales installed.  This is a regular
stock VT (no fonts, etc).
> you're using a terminal/font combination that is not capable of
> correctly rendering UTF-8-encoded common typographical symbols used
> for English language text, like the right single quote / apostrophe.

The apostrophe is in standard ASCII in "C".

> If you use a locale based on ASCII encoding instead, those manpages
> will render more correctly (for example, substituting the unsightly
> ASCII vertical apostrophe for its more urbane cousin or writing (C) in
> place of the copyright symbol).  See the bottom of this post if LANG=C
> isn't good enough for you.
> 

Already using LANG=C


> Unlike some people here, I couldn't give a ???? if you, S. Keeling, or
> anyone else wants to use UTF-8 or not---I'm not on any crusade---but
> an environment variable setting of "LANG=en_US.UTF-8" is basically an
> announcement to applications that your terminal is UTF-8 capable.  You
> don't have to run a UTF-8-capable terminal if you don't want to, but
> you shouldn't lie to your applications and then whine about those damn
> foreigners writing manpages incorrectly (just a joke, just a joke).
> 
> In truth, if you look at the manpage source, you'll probably find that
> the manpage authors *have* used the ASCII "'" character for
> apostrophes and right single quotes.  That's because this is the
> encoding convention used in the typesetting language "roff" in which
> manpages are written.  You write `stuff like this' knowing that a
> correctly configured manpage rendering pipeline will convert those
> ASCII backticks and apostrophes into the correct English typographical
> symbols (if the manpage is being printed or being displayed on a
> sophisticated terminal) or at least do the best it can (if it's being
> delivered to an ASCII-only terminal).  If manpage writers were really
> on the ball, they'd use \(lqleft and right double-quotes\(rq too, but
> you don't see too much of that.
> 
> To clarify further, there's nothing English about "'".  If it's
> anything, it's ASCII, not English.  I'm not sure that the ASCII
> standard actually specifies what printable characters, including "'",
> are supposed to look like, but in most fonts with ASCII-compatible
> encoding, the "'" character is rendered as an undirected,
> typewriter-style apostrophe, like a vertical tickmark, and I believe
> this is pretty much universally accepted as the "correct" rendering of
> this character, among those who care about these things.  In
> particular, it is *not* the character used in typeset English text as
> an apostrophe or right single quote.  It's rarely used in English text
> at all, except in historically ASCII contents like email and computer
> plain text files.  It's about as un-English as you can get.  It's very
> ASCII, though.

According to man ascii, it's ASCII code hex 27 (decimal 39, octal 047).
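
For anyone who wants to check the numbers, here is a minimal Python sketch.
It assumes the "UTF thingy" in question is U+2019 (RIGHT SINGLE QUOTATION
MARK), which is my guess at what the rendered manpage contains; the variable
names are only for illustration.

```python
# The plain ASCII apostrophe versus the typographic right single quote.
ascii_apostrophe = "'"
right_single_quote = "\u2019"  # assumed to be the glyph that shows up as "?"

# Code of the ASCII apostrophe in decimal, hex and octal.
apostrophe_codes = (
    ord(ascii_apostrophe),
    hex(ord(ascii_apostrophe)),
    oct(ord(ascii_apostrophe)),
)
print(apostrophe_codes)  # (39, '0x27', '0o47')

# In UTF-8 the right single quote takes three bytes, none of them ASCII.
utf8_bytes = right_single_quote.encode("utf-8")
print(utf8_bytes)  # b'\xe2\x80\x99'

# Forcing text that contains it into ASCII substitutes "?" for it,
# which looks a lot like the can?t symptom.
forced_ascii = "can\u2019t".encode("ascii", errors="replace")
print(forced_ascii)  # b'can?t'
```

This only shows where a "?" can come from when something downgrades the text
to ASCII; it says nothing about which part of the man pipeline is doing it.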
 

