
Re: Reasons to not use quote signs directly?



[ Russ (CCed), please see below for some inquiries about pod2man. ]

Hi!

On Mon, 2016-09-19 at 18:30:49 +0200, Helge Kreutzmann wrote:
> the dpkg man pages were converted during the recent months from direct
> quote signs to groff macros for the quote signs.

Right, this was done for multiple reasons, at least:

 * To unify and clarify the formatting.
 * To get nice output characters (if available) when rendering
   (‘’, “”, «», etc).
 * To get rid of the ugly `' pairs.

> When we discussed this on debian-l10n-german, we wondered why you use
> the macros like \\(Fo and not simply the unicode character which it
> produces? In the processed output it does not matter, in the source
> code it is much easier to read and translate e.g.
> 
> or what «Fodate -R» generates
> than
> or what \\(Fodate -R\\(Fc generates

Using raw UTF-8 in the roff source is not portable, and some (most?)
implementations might not be happy about that. But using the escape
sequences should always be safe(?). (I've just verified at least on
AIX and Mac OS X systems.)
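
As an illustration only (a sketch written for this mail, not copied from
the dpkg sources), this is roughly what the escape-based form looks like
for the three quote pairs mentioned above. The escape names are the groff
special characters documented in groff_char(7); the sample sentences are
made up, and the sketch itself does not guarantee that every non-groff
implementation knows all of these names:

```roff
.\" Guillemets: \(Fo opens, \(Fc closes.
or what \(Fodate \-R\(Fc generates
.\" Double quotes: \(lq opens, \(rq closes.
being \(lqBase\(rq if the library is not versioned
.\" Single quotes: \(oq opens, \(cq closes.
the \(oq\-dev\(cq package
```

The source stays plain ASCII, and the formatter picks the best glyph the
output device offers.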

But coming back to the source code, yes, I pretty much agree that roff
can be very noisy and non-readable, to the point I've actually gotten
bothered enough to check for possible alternatives this last month. The
problem is finding a format that is clear, expressive enough, supported
by po4a, does not require huge Build-Depends and produces portable and
nicely formatted man pages. The obvious candidate is perl's POD, because
we are already using that for the perl modules and require perl to build.

But I've found some quirks and issues that, while not insurmountable,
might need to be looked at first, and perhaps fixed or worked around,
to avoid "regressions"; I'm not sure which ones Russ would be happy
to get bug reports for. :) I'm attaching a PoC conversion (it can be
tested with «pod2man deb-symbols.pod | man -l -», and is also available
from [G]), and here's a list of potential differences/issues:

  - References are in italic not bold.
  - Does not map ‘’, “”, and other UTF-8 quotes to roff escape sequences
    (or have to use non-portable --utf8 option).
  - Needs raw roff for some formatting, as POD is not expressive enough
    (this will have to be done with «=begin man», as pod2man cannot
    change the POD syntax anyway).
  - Many minus signs are output as hyphens (for example for field names).
  - pod2man does not justify text by default.
  - The license blurb is only present as a comment on the source.

I should probably try converting a more complex man page to see if there
are other issues. But on the plus side, the source is way way more
readable, and as a side-effect it would also fix the problem with
out-dated version and date in man pages. :)

[G] <https://git.hadrons.org/cgit/debian/dpkg/dpkg.git/log/?h=pu/man-switch-to-pod-trial>

> (I know, I changed that myself because for some reason po4a did not
> like the first part which looks like a bug in po4a or some broken
> encoding somewhere).

Hmm, probably using -M UTF-8 in the po4a.cfg would fix this, but as
stated above, that would probably be a bad idea anyway.

> Btw. the German man page project uses (and relies on) UTF-8 for many
> years already.

Right, I don't mind the translated man pages using raw UTF-8 text, as
otherwise we'd need to use escapes also for accented letters which
would be even more cumbersome. :/ As long as the users on “lesser”
systems can use the English man pages I'm happy enough, though.

> As the outcome of this discussion I will update the quotes in the
> German text to the correct ones, either with groff macros or with
> direct input.

For now, and for translated man pages I'd probably just use whatever
UTF-8 text you think is appropriate, but take into account those will
not be usable on systems w/o UTF-8 support, which TBH we can probably
ignore for this purpose.

Thanks,
Guillem
# dpkg manual page - deb-symbols(5)
#
# Copyright © 2007-2012 Raphaël Hertzog <hertzog@debian.org>
# Copyright © 2011, 2013-2015 Guillem Jover <guillem@debian.org>
#
# This is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <https://www.gnu.org/licenses/>.

=encoding utf8

=head1 NAME

deb-symbols - Debian's extended shared library information file

=head1 SYNOPSIS

symbols

=head1 DESCRIPTION

The symbol files are shipped in Debian binary packages, and their format
is a subset of the template symbol files used by L<dpkg-gensymbols(1)>
in Debian source packages.

The format for an extended shared library dependency information entry
in these files is:

=begin man

.PP
.nf
.I library-soname main-dependency-template
[| \fIalternative-dependency-template\fP]
[...]
[* \fIfield-name\fP: \fIfield-value\fP]
[...]
 \fIsymbol\fP \fIminimal-version\fP [\fIid-of-dependency-template\fP]
.fi

=end man

The I<library-soname> is exactly the value of the SONAME field as exported
by L<objdump(1)>. A I<dependency-template> is a dependency where I<#MINVER#>
is dynamically replaced either by a version check like
“(>= I<minimal-version>)” or by nothing (if an unversioned dependency is
deemed sufficient).

Each exported I<symbol> (listed as I<name>@I<version>, with I<version>
being “Base” if the library is not versioned) is associated to a
I<minimal-version> of its dependency template (the main dependency
template is always used and will end up being combined with the dependency
template referenced by I<id-of-dependency-template> if present). The
first alternative dependency template is numbered 1, the second one 2,
etc.

Each entry for a library can also have some fields of meta-information.
Those fields are stored on lines starting with an asterisk. Currently,
the only valid fields are:

=over 4

=item B<Build-Depends-Package>

It indicates the name of the "-dev" package associated to the library
and is used by B<dpkg-shlibdeps> to make sure that the dependency
generated is at least as strict as the corresponding build dependency
(since dpkg 1.14.13).

=item B<Ignore-Blacklist-Groups>

It indicates what blacklist groups should be ignored, as a
whitespace-separated list, so that the symbols contained in those groups
get included in the output file (since dpkg 1.17.6). This should only be
necessary for toolchain packages providing those blacklisted symbols.
The available groups are system dependent; for ELF and GNU-based
systems these are B<aeabi> and B<gomp>.

=back

=head1 EXAMPLES

=head2 Simple symbols file

  libftp.so.3 libftp3 #MINVER#
   DefaultNetbuf@Base 3.1-1-6
   FtpAccess@Base 3.1-1-6
   [...]

=head2 Advanced symbols file

  libGL.so.1 libgl1
  | libgl1-mesa-glx #MINVER#
  * Build-Depends-Package: libgl1-mesa-dev
   publicGlSymbol@Base 6.3-1
   [...]
   implementationSpecificSymbol@Base 6.5.2-7 1
   [...]

=head1 SEE ALSO

L<https://wiki.debian.org/Projects/ImprovedDpkgShlibdeps>,
L<dpkg-shlibdeps(1)>,
L<dpkg-gensymbols(1)>.
