Bug#865713: Please Start UTF-8 debian-policy Text Files with UTF-8 Signature
Paul Hardy <unifoundry@gmail.com> writes:
> That might not be the only UTF-8 that appears in such files someday
> though, so a more general solution would be to start the file with the
> UTF-8 signature, aka the Byte Order Mark (BOM). This is the UTF-8
> encoding of U+FEFF, which is 0xEF 0xBB 0xBF or octal \357 \273 \277.
> Then a web browser should display UTF-8 characters within the text file
> properly.
Hi Paul,
I don't believe it's correct to expect UTF-8 files to include this. I've
heard of BOMs being used this way since the very early days of Unicode,
but so far as I understand it, the world has largely given up on this
approach and UTF-8 generators do not produce them. Debian is full of UTF-8
files (copyright files, changelog files, etc.), and I don't believe we
include a BOM anywhere. I don't think it makes sense for Policy to go
to special effort to be unique in this regard.
You should just assume that all text files in Debian are UTF-8 unless they
are declared otherwise and configure browsers and other file readers
accordingly.
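As a rough illustration of that default (a minimal sketch, not anything
Policy prescribes), a reader written in Python could decode everything as
UTF-8 and simply tolerate a leading BOM if one happens to be present. The
function names here are hypothetical; the "utf-8-sig" codec and
codecs.BOM_UTF8 come from Python's standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def read_text_assuming_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, dropping a leading BOM if there is one.

    The "utf-8-sig" codec decodes ordinary UTF-8 and skips an initial
    U+FEFF, so files with and without the signature are handled alike.
    Input that is not valid UTF-8 raises UnicodeDecodeError.
    """
    return data.decode("utf-8-sig")
```

With something like this, a file does not need the signature to be read
correctly, and a file that does carry one is not harmed by it.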
(Also, if you're viewing things in a web browser, just view the HTML
files. It will be a much better experience.)
--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>