Bug#440420: marked as done ([AMENDMENT 11/02/2008] Manual page encoding)



Your message dated Wed, 04 Jun 2008 23:32:03 +0000
with message-id <E1K42SZ-00060N-Ds@ries.debian.org>
and subject line Bug#440420: fixed in debian-policy 3.8.0.0
has caused the Debian Bug report #440420,
regarding [AMENDMENT 11/02/2008] Manual page encoding
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
440420: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=440420
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: debian-policy
Severity: wishlist

[CCs: debian-i18n and debian-doc for obvious reasons, and the debhelper
maintainer since there's a dh_installman change mentioned in the
transition plan further down.]

Recently I have encountered some confusion as to the proper encoding of
manual pages (which is entirely understandable given that this subsystem
is lagging somewhat behind the rest of the world in terms of UTF-8
support). As the man-db maintainer, I would like to clarify this in
policy.

Note that, while there are one or two instances of deviation which
prompted this proposal, this documents current practice in that it is
what has been implemented in man-db for some time and it is already
followed by the vast majority of packages. I don't believe that I'm
making large swathes of packages instantly buggy here; if they did not
follow this policy, they would already be buggy in that pages would be
displayed with visible encoding damage. Accordingly, I've tentatively
used a "must" for the encoding rules. I'm prepared to back off to a
"should" if consensus on the list is against me here.

I have used the language "not yet recommended" regarding installation of
UTF-8 manual pages. My intent here was not so much to normatively state
that this is a bug as to discourage it for the time being. As I noted in
a footnote, I do expect this to be supported properly in man-db 2.5.0,
which I've been working on for a while now (and in earnest for about the
last week).

I thus propose the following amendment, generated against
debian-policy@lists.debian.org--lenny/debian-policy--devel--3.7--base-0.
I am seeking comments on and seconds for this proposal.

--- orig/policy.sgml
+++ mod/policy.sgml
@@ -8450,6 +8450,39 @@
 	      be present in the future.
  	  </footnote>
  	</p>
+
+	<p>
+	  Manual pages that are installed under
+	  <file>/usr/share/man/</file><var>ll</var>, where <var>ll</var>
+	  is an ISO-639 language code, must be encoded with the usual
+	  legacy (non-UTF-8) character set for that language, as shown
+	  by:
+	  <example compact="compact">
+egrep -v '\.|@|UTF-8' /usr/share/i18n/SUPPORTED
+	  </example>
+	  <footnote>
+	    This is necessary because many packages have historically
+	    included manual pages encoded thus, and changing the
+	    encoding of the whole hierarchy would involve a difficult
+	    transitional period.
+	  </footnote>
+	  Manual pages that are installed under
+	  <file>/usr/share/man/</file><var>locale</var>, where
+	  <var>locale</var> is a full locale name listed in
+	  <file>/usr/share/i18n/SUPPORTED</file>, must be encoded with
+	  the character set implied by that locale.
+	</p>
+
+	<p>
+	  At present, it is not generally possible to install a manual
+	  page encoded in UTF-8 such that it will be used in all locales
+	  for that language (for example, a page installed under
+	  <file>/usr/share/man/fr_FR.UTF-8</file> will not be used in
+	  the <tt>fr_BE.UTF-8</tt> locale). It is therefore not yet
+	  recommended to install pages encoded in UTF-8, but rather to
+	  continue using the legacy encoding.<footnote>This is expected
+	  to change as of man-db 2.5.0.</footnote>
+	</p>
       </sect>
 
       <sect>
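
As a rough illustration of what the egrep filter in the proposed text
selects, here is a minimal Python sketch. It assumes each line of
/usr/share/i18n/SUPPORTED has the form "<locale> <charset>"; the function
name legacy_locales and the sample lines are made up for this example and
are not taken from any real system.

```python
import re

# Same pattern as the egrep -v invocation in the proposed policy text:
# drop any line containing a dot, an "@" modifier, or "UTF-8".
_EXCLUDE = re.compile(r"\.|@|UTF-8")


def legacy_locales(lines):
    """Return {locale: charset} for lines that survive the filter."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or _EXCLUDE.search(line):
            continue
        parts = line.split()
        if len(parts) != 2:
            # Not in the assumed "<locale> <charset>" form; skip it.
            continue
        locale, charset = parts
        result[locale] = charset
    return result


# Hypothetical sample input, for illustration only.
sample = [
    "fr_FR.UTF-8 UTF-8",
    "fr_FR ISO-8859-1",
    "fr_FR@euro ISO-8859-15",
    "pl_PL ISO-8859-2",
]
print(legacy_locales(sample))
```

Only the entries without an explicit codeset or modifier remain, which is
the "usual legacy character set" the proposed wording refers to.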


It will perhaps be helpful if I describe my transition plan for getting
manual pages into UTF-8. Contrary to what occasionally seems to be
popular belief, a newer version of groff is not necessary here. That is
just as well, since repeated attempts to merge in the CJK patch have been
exceedingly painful, though I still hold out hope to get it done
eventually. man-db is capable of shoving in iconv pipes as necessary.
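
To make the recoding idea concrete, the following is a minimal Python
sketch of what an iconv-style conversion step does to the bytes of a page.
It is not how man-db itself is implemented; the function name recode_page
and the sample text are assumptions for illustration.

```python
def recode_page(raw: bytes, source_encoding: str,
                target_encoding: str = "utf-8") -> bytes:
    """Recode raw manual page bytes from one character set to another."""
    return raw.decode(source_encoding).encode(target_encoding)


# A short piece of French text as it would be stored in ISO-8859-1.
legacy_bytes = "Résumé".encode("iso-8859-1")
utf8_bytes = recode_page(legacy_bytes, "iso-8859-1")
print(len(legacy_bytes), len(utf8_bytes))
```

The same text occupies more bytes in UTF-8 because each accented letter
becomes a two-byte sequence, which is why a pager or formatter must be told
which encoding it is reading.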

  1. Status at time of writing: packages should use only
     /usr/share/man/<ll>/ (although some packages have anticipated an
     approximation of the transition plan; we ignore these for the
     moment as there is little point in changing them only to change
     them back later), and must use the legacy encoding for pages
     installed there.

  2. man-db 2.5.0-1 uploaded, including support for installing pages in
     /usr/share/man/<ll>.<codeset>/ (e.g. /usr/share/man/fr.UTF-8). The
     basename of this directory is not typically a well-formed locale,
     but it is appropriate because it allows a clear specification of
     the hierarchy's encoding while applying to all countries using that
     language.

  3. man-db 2.5.0-1 moves into testing.

  4. Packages encouraged (via debian-devel-announce) to begin using
     /usr/share/man/<ll>.UTF-8/; installation in other hierarchies will
     not be necessary as man-db will recode as needed. Packages using
     these hierarchies will be encouraged to declare Conflicts: man-db
     (<< 2.5.0-1) (or will Breaks: be allowed by that point? Is either
     one just overkill?).

  5. Update dh_installman to recode manual pages to UTF-8 automatically
     and install them under /usr/share/man/<ll>.UTF-8/. Getting the
     Conflicts:/Breaks: in here might be difficult, plus I'm not sure
     I'm wild about creating several thousand more arcs in our
     dependency graph. Maybe it's better just to wait for a stable
     release before changing debhelper, and not worry too much about the
     Conflicts:/Breaks: as it's not like the whole system will break as
     a result.

  6. Policy updated once this has been shaken down and confirmed to work
     properly.

  7. Distant future: deprecate /usr/share/man/<ll>/. This will only be
     for consistency, so there's no need to rush.
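
The directory naming scheme in the plan above can be sketched as a small
Python helper that works out which encoding a hierarchy implies. This is
only an illustration under stated assumptions: the names ManDir,
parse_man_dir and page_encoding are hypothetical, the LEGACY_CHARSETS table
is a tiny hand-picked subset rather than the real contents of
/usr/share/i18n/SUPPORTED, and locale modifiers such as "@euro" are not
handled.

```python
from typing import NamedTuple, Optional

# Illustrative subset only; the authoritative data is whatever
# /usr/share/i18n/SUPPORTED lists on the system in question.
LEGACY_CHARSETS = {
    "de": "ISO-8859-1",
    "fr": "ISO-8859-1",
    "pl": "ISO-8859-2",
}


class ManDir(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]


def parse_man_dir(name: str) -> ManDir:
    """Split names like "fr", "fr.UTF-8" or "fr_FR.UTF-8" into parts."""
    base, _, codeset = name.partition(".")
    language, _, territory = base.partition("_")
    return ManDir(language, territory or None, codeset or None)


def page_encoding(name: str) -> Optional[str]:
    """Return the encoding implied by a man hierarchy's directory name."""
    parsed = parse_man_dir(name)
    if parsed.codeset:
        # <ll>.<codeset> or a full locale: the name states the encoding.
        return parsed.codeset
    # Bare <ll> (or <ll>_<CC>): approximate with the language's legacy
    # charset from the sample table above.
    return LEGACY_CHARSETS.get(parsed.language)


for directory in ("fr", "fr.UTF-8", "fr_FR.UTF-8", "pl"):
    print(directory, page_encoding(directory))
```

A name such as fr.UTF-8 parses with no territory, which matches the point
in step 2 that it is not a well-formed locale yet still states the encoding
unambiguously for every country using that language.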

This shouldn't be too difficult from where I am now, and at the moment I
see no obstacles to landing UTF-8 manual page support for lenny. Note
that the implementation using iconv will mean that any characters used
that are not recodable to the corresponding legacy encoding will be
discarded; this is difficult to avoid without upgrading groff, but I
don't anticipate it being a substantial problem. Likewise, we'll
probably still be unable to handle Arabic and Indic scripts properly,
and CJK will probably still be a massive hack; but it'll be an
improvement.
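
The loss of unrecodable characters mentioned above can be illustrated with
a short Python sketch. It uses Python's own errors="ignore" handling as a
stand-in for the discarding behaviour described; it is not a claim about
the exact iconv invocation man-db uses, and the function name to_legacy and
the sample string are assumptions for this example.

```python
def to_legacy(text: str, legacy_encoding: str) -> bytes:
    """Encode text to a legacy charset, silently dropping characters
    that the charset cannot represent."""
    return text.encode(legacy_encoding, errors="ignore")


# The arrow (U+2192) has no ISO-8859-1 equivalent, so it is dropped,
# while the accented Latin letters survive.
source = "na\u00efve \u2192 caf\u00e9"
print(to_legacy(source, "iso-8859-1"))
```

Characters inside the legacy repertoire pass through unchanged, so the
damage is limited to symbols that the legacy encoding never supported.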

Thanks,

-- 
Colin Watson                                       [cjwatson@debian.org]


--- End Message ---
--- Begin Message ---
Source: debian-policy
Source-Version: 3.8.0.0

We believe that the bug you reported is fixed in the latest version of
debian-policy, which is due to be installed in the Debian FTP archive:

debian-policy_3.8.0.0.dsc
  to pool/main/d/debian-policy/debian-policy_3.8.0.0.dsc
debian-policy_3.8.0.0.tar.gz
  to pool/main/d/debian-policy/debian-policy_3.8.0.0.tar.gz
debian-policy_3.8.0.0_all.deb
  to pool/main/d/debian-policy/debian-policy_3.8.0.0_all.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 440420@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Russ Allbery <rra@debian.org> (supplier of updated debian-policy package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Wed, 04 Jun 2008 15:53:27 -0700
Source: debian-policy
Binary: debian-policy
Architecture: source all
Version: 3.8.0.0
Distribution: unstable
Urgency: low
Maintainer: Debian Policy List <debian-policy@lists.debian.org>
Changed-By: Russ Allbery <rra@debian.org>
Description: 
 debian-policy - Debian Policy Manual and related documents
Closes: 65577 186700 209008 250202 291460 367984 379150 392362 403391 422552 430649 431813 440420 442070 452105 455602 458910 473761 475731 480551 481640 481954
Changes: 
 debian-policy (3.8.0.0) unstable; urgency=low
 .
   * Bug fix: "[PROPOSAL] "debian/README.source" file for packages with
     non-trivial source", thanks to Wouter Verhelst, Jörg Sommer, Colin Watson,
     and Junichi Uekawa                                       (Closes: #250202).
   * Bug fix: "[AMENDMENT 11/02/2008] Manual page encoding", thanks to
     Colin Watson                                             (Closes: #440420).
   * Bug fix: "[PROPOSAL] common interface for parallel building in
     DEB_BUILD_OPTIONS", thanks to Loïc Minier, Peter Samuelson, and Robert
     Millan                                                   (Closes: #209008).
   * Bug fix: "Please clarify splitting/syntax of DEB_BUILD_OPTIONS", thanks to
     Loïc Minier, Peter Samuelson, Robert Millan, and Guillem Jover
                                                              (Closes: #430649).
   * Bug fix: "Documentation for Breaks in dpkg", thanks to Ian Jackson
                                                              (Closes: #379150).
   * Bug fix: "support for wrapped Uploaders should now be mandatory"
                                                              (Closes: #431813).
   * Bug fix: "[PROPOSAL] Add should not embed code from other packages",
     thanks to Neil McGovern, Colin Watson, Bill Allombert, Steve Langasek,
     Kurt Roeckx, and others                                  (Closes: #392362).
   * Bug fix: "Homepage field in debian/control undocumented", thanks to
     Mario Iseli                                              (Closes: #452105).
   * Bug fix: "Policy inconsistent with reality: base subsection no longer
     used", thanks to Magnus Holmgren, Bernd Zeimetz, and Colin Watson
                                                              (Closes: #442070).
   * Bug fix: "Inclusion of Apache Software License versions in
     /usr/share/common-licenses", thanks to Barry Hawkins     (Closes: #291460).
   * Bug fix: "[Amended] copyright should include notice if a package is
     not a part of Debian distribution", thanks to Taketoshi Sano
                                                              (Closes: #65577).
   * Bug fix: "scripts as configuration files: should vs. must", thanks to Frank
     Küster                                                   (Closes: #403391).
   * Bug fix: "debconf specification should allow underscores in template
     names", thanks to Colin Watson                           (Closes: #473761).
   * Bug fix: "clarify handling of run-time and compile-time support programs",
     thanks to Goswin Brederlow and Raphael Hertzog           (Closes: #367984).
   * Policy: better document version ranking and empty Debian revisions
     Wording: Russ Allbery <rra@debian.org>
     Seconded: Raphaël Hertzog <hertzog@debian.org>
     Seconded: Manoj Srivastava <srivasta@debian.org>
     Seconded: Guillem Jover <guillem@debian.org>
     Closes: #186700, #458910
   * Policy: remove obsolete app-defaults and Xresources provisions
     Wording: Julien Cristau <jcristau@debian.org>
     Seconded: Russ Allbery <rra@debian.org>
     Closes: #480551
   * Bug fix: "Examples of dpkg frontends should mention apt now", thanks
     to Josh Triplett                                         (Closes: #455602).
   * Bug fix: "Minor typos and wording suggestions", thanks to Michael
     Tautschnig                                               (Closes: #422552).
   * Bug fix: "substvar reference moved from dpkg-source(1) to
     deb-substvars(5)", thanks to Ian Beckwith                (Closes: #475731).
   * Policy: bugs fixed in NMUs are now closed rather than marked fixed
     Wording: Russ Allbery <rra@debian.org> (thanks, Sandro Tosi)
     Closes: #481640
   * Policy: C.1.4, C.1.8: minor typos
     Wording: Sandro Tosi <matrixhasu@gmail.com>
     Closes: #481954
   * Remove the now-obsolete policy-process document.
   * Add an md5sums control file.
   * Add Vcs-Browser and Vcs-Git control fields.
   * Remove build system support for FHS 2.1 and FSSTND, mostly commented out.
   * Remove more temporary files created by the build.
   * Remove the FSSTND license from debian/copyright; no FSSTND files are
     currently part of policy.
   * Update FHS copyright dates in debian/copyright.
   * Standardize the spacing around headings in upgrading-checklist.html.
   * Remove old ChangeLog files and metadata headers in maintainer scripts
     and debian/rules.
Checksums-Sha1: 
 f42b9921908670eb41c04940875084bc07750592 1095 debian-policy_3.8.0.0.dsc
 3eda45d7ca5563bab8bfda93286137071979385c 638655 debian-policy_3.8.0.0.tar.gz
 73680c98bc62507858aa055bcf1f1688a812f5ba 1588552 debian-policy_3.8.0.0_all.deb
Checksums-Sha256: 
 507a048bc7c84039910843e284d8e0e305778224346fd981c6f749176cc79220 1095 debian-policy_3.8.0.0.dsc
 8321b1dddd3ddd55a09539c842084ea05a731265c4c5847997957a552ba1aaa4 638655 debian-policy_3.8.0.0.tar.gz
 6c2083f50ccaa5a2f2d7a89febd320cf3a862b3204157324ffd9b363daac3e58 1588552 debian-policy_3.8.0.0_all.deb
Files: 
 37ff33fb3ccebc4f87e23fd7b91e7859 1095 doc optional debian-policy_3.8.0.0.dsc
 2565d6eaceac0aa2d093538048c1b8ed 638655 doc optional debian-policy_3.8.0.0.tar.gz
 3b153faeec899cdf1199d4d46c5d8859 1588552 doc optional debian-policy_3.8.0.0_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFIRyNB+YXjQAr8dHYRAt4NAKDbO1f3BlmKT5SgMVf4AHE2Z7bPTgCffcnI
Kwa3jEGgq+PV6dwiurjmSAc=
=wCDz
-----END PGP SIGNATURE-----



--- End Message ---
