Bug#99933: Refined proposal
Here is the proposal with typos and mistakes fixed, and with an added
paragraph about the possible use of other encodings. I left out
the requirement to specify a font needed to view the
documentation, since IMHO that is an overcomplication and
--- policy.sgml-old Fri Jun 1 11:40:16 2001
+++ policy.sgml Thu Jun 7 13:31:09 2001
@@ -1653,6 +1653,15 @@
+ <sect id="controlencoding"><heading>Encoding of control files</heading>
+ If, for whatever reason (such as upstream authors' or maintainers'
+ names, foreign-language package descriptions and similar), you need to
+ use characters outside the 7-bit ASCII range in control files, these
+ characters should be encoded in UTF-8.
<chapt id="versions"><heading>Version numbering</heading>
@@ -2276,8 +2285,16 @@
+ <sect1><heading>Character set of <tt>debian/changelog</tt></heading>
+ The character set of <tt>debian/changelog</tt> should be either pure ASCII or UTF-8.
and variable substitutions </heading>
@@ -7370,6 +7387,26 @@
+ Documentation of Debian packages in text format, if written in
+ a language requiring characters outside the 7-bit ASCII range,
+ should use either a well-established encoding for the given
+ language <footnote>such as ISO-8859-2 for some Central and Eastern
+ European languages, KOI8-R for Russian, etc.</footnote>, or UTF-8.
+ Maintainers are encouraged to use UTF-8, bearing in mind
+ the general Debian migration toward a unified character encoding.
+ Original upstream documentation, if in an encoding other than UTF-8
+ or the well-established encoding for the particular language,
+ should be converted either to UTF-8 or to the well-established
+ encoding. The choice between UTF-8 and another encoding is left to the
+ maintainer's discretion; however, within a single package, all
+ documents written in a particular language should share the same encoding.
+ A package may (at the discretion of the maintainer) include documentation
+ files in other encodings, if they are also present in the canonical encoding
+ and if the encodings used are clearly marked.
@@ -7440,6 +7477,18 @@
Other formats such as PostScript may be provided at the
package maintainer's discretion.
+ HTML documents, if in an encoding other than <tt>us-ascii</tt>, should
+ have in their header an appropriate META tag declaring
+ the encoding used.
+ <META HTTP-Equiv="Content-Type" CONTENT="text/html; charset=UTF-8">
@@ -7555,6 +7604,24 @@
changelog, then the Debian changelog should still be called
+ <sect id="charset">
+ <heading>Default character set</heading>
+ Names of maintainers, upstream authors and other data in
+ packages' descriptions and related Debian data files (such as
+ <tt>debian/changelog</tt>, <tt>debian/copyright</tt>,
+ <tt>debian/control</tt>), as well as in English-language
+ documentation, should be either transliterated or
+ transcribed to ASCII, or written in UTF-8 at the
+ discretion of the maintainer. However, for names
+ in scripts based on non-Latin alphabets, an ASCII (or suitable
+ Latin-script) version should be provided along with the original