Re: pre-approval for perl 5.10.0-16 changes



On Thu, Oct 02, 2008 at 04:23:19PM +0300, Niko Tyni wrote:

> Unfortunately a new encoding issue with the "pod2man --utf8"
> functionality has cropped up; see #500210. 

> I'd like to upload this:

Blah, scratch that. Further testing indicated it produced invalid UTF-8
in too many situations.

We actually do need a code change, but it only affects the "--utf8"
code path, which is not enabled by default, so it should be safe.
I'm bumping $Pod::Man::VERSION to '2.18_01'; the "eval $VERSION" line
numifies that string, the same trick that e.g. Module::Build uses.
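
For illustration, here is a minimal standalone sketch of that numifying
idiom. The variable names are made up for the example and are not taken
from the patch; only the '2.18_01' string and the eval come from it.

```perl
use strict;
use warnings;

# Underscore-style version string, as used in the patch.
my $version = '2.18_01';

# eval parses the string as Perl source. A numeric literal may contain
# underscores, so the result is the plain number 2.1801.
my $numified = eval $version;

# Comparing the raw string numerically would warn ("isn't numeric") and
# only see the leading 2.18; the numified value compares cleanly.
print "$numified\n";
print "new enough\n" if $numified >= 2.18;
```

Callers that test "$Pod::Man::VERSION >= 2.18" therefore keep working
with the bumped version.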

The debconf translated manual pages (which is where all of this pretty
much started) finally work with this.

There's also a new 5.10.0 crash regression (#501178) with a simple fix
from upstream that I'd like to get in.

I'm attaching a new full diff against 5.10.0-15 with debian/patches
duplication cleaned out.

Release folks, please ack once more and I'll upload.

Changes: 
 perl (5.10.0-16) unstable; urgency=low
 .
   * Revert the perldoc "pod2man --utf8" change from 5.10.0-14.
     The --utf8 option may break for POD documents with a wrong or missing
     =encoding. (Reopens: #492037)
   * Make Pod::Man use the PerlIO UTF-8 output layer when --utf8 is
     enabled. (See #500210)
   * Revert an incorrect substitution optimization introduced in 5.10.0.
     (Closes: #501178)
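
For illustration, the double-encoding problem behind the first two items
can be reproduced with a short sketch. This is not code from the patch:
the sample bytes, the through_layer() helper, and the in-memory file
handles are assumptions made for the example. Only the
':encoding(UTF-8)' layer matches what the Pod::Man change applies.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 bytes of one e-acute character (assumed sample input).
my $bytes = "\xC3\xA9";

# Correctly declared input: decoded as UTF-8, giving one character.
my $declared = decode('UTF-8', $bytes);

# Missing =encoding: the same bytes are taken as Latin-1, giving two
# characters.
my $undeclared = decode('ISO-8859-1', $bytes);

# Hypothetical helper: write text through a UTF-8 output layer into an
# in-memory buffer and return the resulting bytes.
sub through_layer {
    my ($text) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
    binmode $fh, ':encoding(UTF-8)';
    print {$fh} $text;
    close $fh;
    return $buffer;
}

my $ok      = through_layer($declared);    # the original 2 bytes
my $doubled = through_layer($undeclared);  # 4 bytes: double-encoded

printf "declared: %d bytes, undeclared: %d bytes\n",
    length($ok), length($doubled);
```

The output layer itself is doing the right thing in both cases; the
damage comes from the wrong assumption about the input encoding.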

Thanks,
-- 
Niko Tyni   ntyni@debian.org
diff -u perl-5.10.0/pp_ctl.c perl-5.10.0/pp_ctl.c
--- perl-5.10.0/pp_ctl.c
+++ perl-5.10.0/pp_ctl.c
@@ -218,7 +218,6 @@
 	if (!(cx->sb_rxtainted & 2) && SvTAINTED(TOPs))
 	    cx->sb_rxtainted |= 2;
 	sv_catsv(dstr, POPs);
-	FREETMPS; /* Prevent excess tmp stack */
 
 	/* Are we done */
 	if (cx->sb_once || !CALLREGEXEC(rx, s, cx->sb_strend, orig,
diff -u perl-5.10.0/patches-applied perl-5.10.0/patches-applied
diff -u perl-5.10.0/pod/pod2man.PL perl-5.10.0/pod/pod2man.PL
--- perl-5.10.0/pod/pod2man.PL
+++ perl-5.10.0/pod/pod2man.PL
@@ -259,6 +259,12 @@
 supported by many implementations and may even result in segfaults and
 other bad behavior.
 
+Be aware that, when using this option, the input encoding of your POD
+source must be properly declared unless it is US-ASCII or Latin-1.  POD
+input without an C<=encoding> command will be assumed to be in Latin-1,
+and if it's actually in UTF-8, the output will be double-encoded.  See
+L<perlpod(1)> for more information on the C<=encoding> command.
+
 =item B<-v>, B<--verbose>
 
 Print out the name of each output file as it is being generated.
@@ -534,8 +540,8 @@
 
 =head1 SEE ALSO
 
-L<Pod::Man>, L<Pod::Simple>, L<man(1)>, L<nroff(1)>, L<podchecker(1)>,
-L<troff(1)>, L<man(7)>
+L<Pod::Man>, L<Pod::Simple>, L<man(1)>, L<nroff(1)>, L<perlpod(1)>,
+L<podchecker(1)>, L<troff(1)>, L<man(7)>
 
 The man page documenting the an macro set may be L<man(5)> instead of
 L<man(7)> on your system.
diff -u perl-5.10.0/lib/Pod/Man.pm perl-5.10.0/lib/Pod/Man.pm
--- perl-5.10.0/lib/Pod/Man.pm
+++ perl-5.10.0/lib/Pod/Man.pm
@@ -36,7 +36,9 @@
 
 @ISA = qw(Pod::Simple);
 
-$VERSION = '2.18';
+# Custom Debian version, see http://bugs.debian.org/500210
+$VERSION = '2.18_01';
+$VERSION = eval $VERSION;
 
 # Set the debugging level.  If someone has inserted a debug function into this
 # class already, use that.  Otherwise, use any Pod::Simple debug function
@@ -731,6 +733,19 @@
         return;
     }
 
+    # If we were given the utf8 option, set an output encoding on our file
+    # handle.  Wrap in an eval in case we're using a version of Perl too old
+    # to understand this.
+    #
+    # This is evil because it changes the global state of a file handle that
+    # we may not own.  However, we can't just blindly encode all output, since
+    # there may be a pre-applied output encoding (such as from PERL_UNICODE)
+    # and then we would double-encode.  This seems to be the least bad
+    # approach.
+    if ($$self{utf8}) {
+        eval { binmode ($$self{output_fh}, ':encoding(UTF-8)') };
+    }
+
     # Determine information for the preamble and then output it.
     my ($name, $section);
     if (defined $$self{name}) {
@@ -1592,6 +1607,12 @@
 by many implementations and may even result in segfaults and other bad
 behavior.
 
+Be aware that, when using this option, the input encoding of your POD
+source must be properly declared unless it is US-ASCII or Latin-1.  POD
+input without an C<=encoding> command will be assumed to be in Latin-1,
+and if it's actually in UTF-8, the output will be double-encoded.  See
+L<perlpod(1)> for more information on the C<=encoding> command.
+
 =back
 
 The standard Pod::Simple method parse_file() takes one argument naming the
@@ -1627,6 +1648,12 @@
 
 =head1 BUGS
 
+Encoding handling assumes that PerlIO is available and does not work
+properly if it isn't since encode and decode do not work well in
+combination with PerlIO encoding layers.  It's very unclear how to
+correctly handle this without PerlIO encoding layers.  The C<utf8> option
+is therefore not supported unless Perl is built with PerlIO support.
+
 There is currently no way to turn off the guesswork that tries to format
 unmarked text appropriately, and sometimes it isn't wanted (particularly
 when using POD to document something other than Perl).  Most of the work
@@ -1652,6 +1679,13 @@
 
 =head1 CAVEATS
 
+If Pod::Man is given the C<utf8> option, the encoding of its output file
+handle will be forced to UTF-8 if possible, overriding any existing
+encoding.  This will be done even if the file handle is not created by
+Pod::Man and was passed in from outside.  This seems to be the only way to
+consistently enforce UTF-8-encoded output regardless of PERL_UNICODE and
+other settings.
+
 The handling of hyphens and em dashes is somewhat fragile, and one may get
 the wrong one under some circumstances.  This should only matter for
 B<troff> output.
reverted:
--- perl-5.10.0/lib/Pod/Perldoc/ToMan.pm
+++ perl-5.10.0.orig/lib/Pod/Perldoc/ToMan.pm
@@ -60,10 +60,6 @@
       unless -e $pod2man;
   }
 
-  eval { require Pod::Man };
-  $switches .= " --utf8"
-    if (!$@ && $Pod::Man::VERSION >= 2.18);
-
   my $command = "$pod2man $switches --lax $file | $render -man";
          # no temp file, just a pipe!
 
diff -u perl-5.10.0/debian/changelog perl-5.10.0/debian/changelog
--- perl-5.10.0/debian/changelog
+++ perl-5.10.0/debian/changelog
@@ -1,3 +1,15 @@
+perl (5.10.0-16) unstable; urgency=low
+
+  * Revert the perldoc "pod2man --utf8" change from 5.10.0-14.
+    The --utf8 option may break for POD documents with a wrong or missing
+    =encoding. (Reopens: #492037)
+  * Make Pod::Man use the PerlIO UTF-8 output layer when --utf8 is
+    enabled. (See #500210)
+  * Revert an incorrect substitution optimization introduced in 5.10.0.
+    (Closes: #501178)
+
+ -- Niko Tyni <ntyni@debian.org>  Sun, 05 Oct 2008 16:00:41 +0300
+
 perl (5.10.0-15) unstable; urgency=low
 
   * Fix Sys::Syslog slowness when logging with non-native mechanisms.