
Re: Invalid sequences in https://www.debian.org/doc/debian-policy/policy.txt



On Sat, 12 Apr 2025 at 21:02:14 +0200, Jérémy Lal wrote:
On Sat, 12 Apr 2025 at 13:27, Bill Allombert <ballombe@debian.org> wrote:

   On Fri, Apr 11, 2025 at 07:21:47PM -0700, pF arQon wrote:
   > I can't infer the cause from the end result, but the supposed-plain-text
   > document has numerous instances of Apple Quotes in it, which obviously
   > aren't valid in either ASCII or any other non-MBCS encoding. This
   > includes many forms of [x]term, and even things like Firefox on Debian
   > itself *in a UTF8 locale*, because text/plain by definition does not
   > permit binary data.

   As far as I can see, the bug is that the webserver advertises the text as
   windows-1252 instead of UTF-8 (as you can see using control-I under
   Firefox).


It doesn't:
dev@lal:~$ curl -I https://www.debian.org/doc/debian-policy/policy.txt
HTTP/2 200
...
content-type: text/plain

I believe Content-type: text/plain is implicitly US-ASCII as per RFC 2046, although browsers recover from non-ASCII characters in an ostensibly ASCII file by attempting to guess ("sniff") a more suitable character set (and apparently Firefox guesses Windows-1252).
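
To make the symptom concrete, here is a minimal Python sketch (not taken from the thread). It assumes the "Apple Quotes" are typographic quotes such as U+2019 stored as UTF-8; the helper name declared_charset and the sample string are made up for illustration. It shows that a bare "text/plain" header declares no charset, and what the same bytes look like when a client guesses Windows-1252.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Hypothetical sample: U+2019 RIGHT SINGLE QUOTATION MARK encoded as UTF-8.
utf8_bytes = "Debian\u2019s".encode("utf-8")

# The bytes are not pure ASCII, so a US-ASCII reading cannot represent them.
is_ascii = utf8_bytes.isascii()

# Decoded with the right charset, the quote survives.
as_utf8 = utf8_bytes.decode("utf-8")

# Decoded as Windows-1252 (a sniffing fallback), each byte becomes one character.
as_cp1252 = utf8_bytes.decode("cp1252")

print(declared_charset("text/plain"))                 # header as served above
print(declared_charset("text/plain; charset=UTF-8"))  # header with a charset
print(is_ascii, as_utf8, as_cp1252)
```

The Windows-1252 reading turns the three UTF-8 bytes of the quote into three separate characters, which is the usual mojibake seen when the charset is guessed wrongly.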

Ideally the www.debian.org web server should announce policy.txt and other UTF-8 text files as UTF-8. This is a web server configuration issue, not a Policy issue.
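
As an illustration only: the sketch below uses Python's standard http.server to show what "announce .txt files as UTF-8" means at the header level. It says nothing about how www.debian.org is actually configured, and the class name Utf8TextHandler is hypothetical.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class Utf8TextHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, labelling .txt as UTF-8."""

    # Copy the default extension map and override only the .txt entry,
    # so the Content-Type header carries an explicit charset parameter.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".txt": "text/plain; charset=utf-8",
    }


if __name__ == "__main__":
    # Assumption: a local test on port 8000; adjust as needed.
    HTTPServer(("127.0.0.1", 8000), Utf8TextHandler).serve_forever()
```

With this running in a directory that contains a copy of policy.txt, "curl -I http://127.0.0.1:8000/policy.txt" should report a content-type that includes charset=utf-8, so clients no longer need to guess.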

    smcv

