Re: UTF-8 in jessie

Quoting Ian Jackson (2013-08-29 13:56:09)
> Adam Borowski writes ("Re: UTF-8 in jessie"):
> > Let's take a look at some sheets.
> Last time I looked at this I found a copy of the actual ASCII 
> standards document from 1968 or so and it did mention this usage.
> > > I don't think that better UTF-8 support should involve needlessly 
> > > converting 7-bit ASCII text files which use ` ' as matched quotes, 
> > > into UTF-8 text files which use non-ISO-646 codepoints.
> > 
> > These code points are defined to be exactly the same in both ASCII 
> > and Unicode.  Only fonts may differ.  And like Han unification 
> > issues, this is out of scope here.
> Do you intend that text files containing uses of ` ' as matched single 
> quotes should be changed to use non-7-bit BMP matched single quotes ? 
> It seems that you don't.
> In which case I'm afraid you will have to make this explicit somehow 
> in your proposal.  Otherwise zealous people will go around complaining 
> about funny-looking quotes and changing a whole bunch of text files to 
> no longer be 7-bit.
> See GCC's error messages, for a case in point.

I believe the underlying issue is the one summarized here: 

If that is correct, then the issue here is not whether ASCII "`" equals 
UTF-8 "'" (or some similar recoding), but that _authors_ from an era 
when output displayed ` as the mirror image of ' grew a habit of typing 
` into documents as an opening quote.

How about we simply state explicitly that `arcane quoting', even if 
arguably related to UTF-8 encoding, should be classified not as a 
release-critical bug but as a spelling error?

 - Jonas

 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
