Bug#175064: DocBook XML conversion is ready with this script



Hi!

On Sat, 2017-01-14 at 10:59:45 -0800, Russ Allbery wrote:
> Guillem Jover <guillem@debian.org> writes:
> > I've prepared a renewal of the conversion. And scripted it so that it
> > can be performed at any point in time regardless of most changes in the
> > sources.
> 
> > This also includes several fixes to the current SGML source to ease the
> > future conversion, I think these are fine to merge now already.
> 
> > There are still some things to polish and review I'm afraid. I think
> > there is a problem with spacing which get mangled on conversion, but I'd
> > need to recheck. The comments are currently lost. The PS and PDF
> > generation might also need some work. I think all IDs are preserved, but
> > this also needs checking.
> 
> > The current state can be tracked in the pu/markup-singularity branch at
> > <https://git.hadrons.org/cgit/debian/policy.git/>, which I might rebase
> > at any point in time.
> 
> Awesome, thank you so much!

No problem!

> Are any of the sub-policies ready to convert to DocBook right now?  We
> could convert them for the next release and worry about the main Policy
> document, which presumably would be harder, in a later release.

I've found the cause of the wrong spacing: it was due to tidy(1). I've
now played with xmllint(1) and pandoc(1), but have disabled the initial
cleanup for now (branch updated). So the converted XML is not indented;
I'm not sure whether you are fine with that.

I'm including a patchset which fixes several things that will make the
conversion easier, and I think they are correct independently of the
conversion.

The remaining possible output issues/differences are:

  * The Abstract and Copyright Notice end up w/o any heading, so they
    are a bit hard to distinguish.
  * The authors are listed at the top of the documents instead of at
    the bottom.
  * The policy version and date are not output.
  * The upgrading-checklist.xml output generates a TOC, the new
    html-notoc.dsl needs to be hooked into the build machinery to
    avoid that.
  * The PDF/PS output for policy.xml probably needs some tuning.
  * The build dependencies might need checking for additions or
    removals.
  * The XML is not shipped for some of the converted documents; I'm
    not sure why the SGML was being shipped before.
  * … (Probably some other stuff I might be forgetting now.)

OTOH, the output seems less cluttered which looks like an improvement
to me.

Thanks,
Guillem
From 2683ceca9e851dc8f7df964d8fa67408ad157466 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Tue, 10 Jan 2017 00:31:37 +0100
Subject: [PATCH 1/7] Use entities instead of literal <, > and &

This is required in DocBook, otherwise XML toolchains trip over them.
---
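
(Note below the marker, so not part of the commit message.) A minimal
sketch of why the literal characters matter, assuming nothing more than
Python's standard XML parser. The <example> wrapper is only borrowed
here as an element name for illustration, and the helper name parses()
is made up; the shell line is the one from the first hunk.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Shell line taken from the first hunk of this patch.
SHELL_LINE = 'if [ "$1" = purge ] && [ -e /usr/share/debconf/confmodule ]; then'


def parses(xml_text):
    """Return True if xml_text is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Literal '&&' in element content is not well-formed XML.
raw = "<example>" + SHELL_LINE + "</example>"

# escape() rewrites '&', '<' and '>' as entities.
escaped = "<example>" + escape(SHELL_LINE) + "</example>"

print("literal &&  well-formed:", parses(raw))
print("escaped     well-formed:", parses(escaped))

# The parser hands back the original shell text once entities are resolved.
print("round-trips:", ET.fromstring(escaped).text == SHELL_LINE)
```

The same kind of check could be run over each <example> body before
conversion, to spot any literal that was missed.
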
 policy.sgml | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/policy.sgml b/policy.sgml
index b0b2e09..9810090 100644
--- a/policy.sgml
+++ b/policy.sgml
@@ -4274,7 +4274,7 @@ Checksums-Sha256:
 	        facility the <prgn>postrm</prgn> intends to call is
 	        available before calling it.  For example:
 <example>
-if [ "$1" = purge ] && [ -e /usr/share/debconf/confmodule ]; then
+if [ "$1" = purge ] &amp;&amp; [ -e /usr/share/debconf/confmodule ]; then
         . /usr/share/debconf/confmodule
         db_purge
 fi
@@ -8593,7 +8593,7 @@ exec /usr/lib/foo/foo "$@"
             <tt>upstart</tt> and avoid running in favor of the native
             upstart job, using a test such as this:
 	    <example compact="compact">
-if [ "$1" = start ] && which initctl >/dev/null && initctl version | grep -q upstart
+if [ "$1" = start ] &amp;&amp; which initctl >/dev/null &amp;&amp; initctl version | grep -q upstart
 then
 	exit 1
 fi
@@ -9712,7 +9712,7 @@ ln -fs ../sbin/sendmail debian/tmp/usr/bin/runq
 for i in /usr/bin/foo /usr/sbin/bar
 do
   # only do something when no setting exists
-  if ! dpkg-statoverride --list $i >/dev/null 2>&1
+  if ! dpkg-statoverride --list $i &gt;/dev/null 2&gt;&amp;1
   then
     #include: debconf processing, question about foo and bar
     if [ "$RET" = "true" ] ; then
@@ -9726,7 +9726,7 @@ done
 	    <example>
 for i in /usr/bin/foo /usr/sbin/bar
 do
-  if dpkg-statoverride --list $i >/dev/null 2>&1
+  if dpkg-statoverride --list $i >/dev/null 2&gt;&amp;1
   then
     dpkg-statoverride --remove $i
   fi
@@ -12341,7 +12341,7 @@ END-INFO-DIR-ENTRY
 	older version (unless the older version is so old that direct
 	upgrades are no longer supported):
 	<example>
-  if [ abort-upgrade = "$1" ] && dpkg --compare-versions "$2" lt 1.0-2; then
+  if [ abort-upgrade = "$1" ] &amp;&amp; dpkg --compare-versions "$2" lt 1.0-2; then
      dpkg-divert --package smailwrapper --remove --rename \
         --divert /usr/sbin/smail.real /usr/sbin/smail
   fi
-- 
2.11.0

From 3bd9ff501328e1c5e88721ef86164ad7415c9327 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Tue, 10 Jan 2017 00:40:50 +0100
Subject: [PATCH 2/7] Use <var> instead of angle bracket entities

This gets rid of these entities, which otherwise get lost in the DocBook
conversion, and also switches to use the more correct markup anyway.
---
 policy.sgml | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/policy.sgml b/policy.sgml
index 9810090..afa05a6 100644
--- a/policy.sgml
+++ b/policy.sgml
@@ -3391,8 +3391,8 @@ Package: libc6
 
 	  <p>
 <example>
-	Description: &lt;single line synopsis&gt;
-	 &lt;extended description over several lines&gt;
+	Description: <var>single line synopsis</var>
+	 <var>extended description over several lines</var>
 </example>
 	  </p>
 
@@ -7106,10 +7106,10 @@ Built-Using: grub2 (= 1.99-9), loadlin (= 1.6e-1)
 	      </item>
 	      <item>
 		<p>
-                  The requirement for <file>/usr/local/lib&lt;qual&gt;</file>
-                  to exist if <file>/lib&lt;qual&gt;</file> or
-                  <file>/usr/lib&lt;qual&gt;</file> exists (where 
-                  <file>lib&lt;qual&gt;</file> is a variant of
+                  The requirement for <file>/usr/local/lib<var>qual</var></file>
+                  to exist if <file>/lib<var>qual</var></file> or
+                  <file>/usr/lib<var>qual</var></file> exists (where
+                  <file>lib<var>qual</var></file> is a variant of
                   <file>lib</file> such as <file>lib32</file> or
                   <file>lib64</file>) is removed.
                   </p>
@@ -7802,11 +7802,11 @@ test -f <var>program-executed-later-in-script</var> || exit 0
 	    <p>
 	      Most packages will simply need to change:
 	      <example compact="compact">
-/etc/init.d/&lt;package&gt; &lt;action&gt;
+/etc/init.d/<var>package</var> <var>action</var>
 	      </example> in their <prgn>postinst</prgn>
 	      and <prgn>prerm</prgn> scripts to:
 	      <example compact="compact">
-invoke-rc.d <var>package</var> &lt;action&gt;
+invoke-rc.d <var>package</var> <var>action</var>
 	      </example>
 	    </p>
 
@@ -9962,7 +9962,7 @@ http://localhost/cgi-bin/.../<var>cgi-bin-name</var>
                 may be referred to through an alias <tt>/images/</tt>
                 as
                 <example>
-                  http://localhost/images/&lt;package&gt;/&lt;filename&gt;     
+                  http://localhost/images/<var>package</var>/<var>filename</var>
                 </example>
                 
               </p>
-- 
2.11.0

From a23b7bd59bba5269e2a00dbbf50e519ca0373651 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Tue, 10 Jan 2017 01:05:30 +0100
Subject: [PATCH 3/7] Use <var> instead of <em>

<emphasis> is not allowed within <literal> in DocBook, which is what
this gets converted to. Instead use <var> which is allowed, and is also
a more correct markup anyway.
---
 policy.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/policy.sgml b/policy.sgml
index afa05a6..084472c 100644
--- a/policy.sgml
+++ b/policy.sgml
@@ -1920,8 +1920,8 @@ zope.
 	  It must start with the line <tt>#!/usr/bin/make -f</tt>,
 	  so that it can be invoked by saying its name rather than
 	  invoking <prgn>make</prgn> explicitly. That is, invoking
-          either of <tt>make -f debian/rules <em>args...</em></tt>
-          or <tt>./debian/rules <em>args...</em></tt> must result in
+          either of <tt>make -f debian/rules <var>args...</var></tt>
+          or <tt>./debian/rules <var>args...</var></tt> must result in
           identical behavior.
 	</p>
 
-- 
2.11.0

From 4b4e18a38f4fbc88bc68601d179279f2a4f6b13f Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Tue, 10 Jan 2017 00:45:38 +0100
Subject: [PATCH 4/7] Do not use slashes in section ID attributes

These are not valid in DocBook, and we should avoid using an ID that
will need to be changed later on, so that the ID can be preserved.
---
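
(Note below the marker, so not part of the commit message.) A rough way
to spot such IDs ahead of the conversion, sketched in Python. The
pattern is a deliberately simplified, ASCII-only approximation of the
XML "Name" production, and the function name is made up for this
example; the real production also admits ':' and many non-ASCII
characters, but never '/'.

```python
import re

# Simplified, ASCII-only approximation of an XML Name:
# a letter or underscore, then letters, digits, '.', '_' or '-'.
ASCII_XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_plausible_xml_id(value):
    """Return True if value fits the simplified pattern (hypothetical helper)."""
    return ASCII_XML_NAME.fullmatch(value) is not None


# The old ID, its replacement from this patch, and a neighbouring ID.
for candidate in ("/etc/init.d", "etc-init.d", "sysvinit"):
    verdict = "ok" if is_plausible_xml_id(candidate) else "needs renaming"
    print(candidate, "->", verdict)
```

Any ID that gets renamed this way also needs its <ref id="..."> users
updated, as done in the second hunk.
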
 policy.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/policy.sgml b/policy.sgml
index 084472c..fb1f30f 100644
--- a/policy.sgml
+++ b/policy.sgml
@@ -7438,7 +7438,7 @@ rmdir /usr/local/share/emacs 2>/dev/null || true
       <sect id="sysvinit">
 	<heading>System run levels and <file>init.d</file> scripts</heading>
 
-	<sect1 id="/etc/init.d">
+	<sect1 id="etc-init.d">
 	  <heading>Introduction</heading>
 
 	  <p>
@@ -7833,7 +7833,7 @@ invoke-rc.d <var>package</var> <var>action</var>
             which contained scripts which were run once per machine
             boot. This has been deprecated in favour of links from
             <file>/etc/rcS.d</file> to files in <file>/etc/init.d</file> as
-            described in <ref id="/etc/init.d">.  Packages must not
+            described in <ref id="etc-init.d">.  Packages must not
             place files in <file>/etc/rc.boot</file>.
 	  </p>
 	</sect1>
-- 
2.11.0

From 22a8ff8fcbe002a0d76cd21a1aadf52dc998cc22 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Sun, 15 Jan 2017 19:15:19 +0100
Subject: [PATCH 5/7] Turn an SGML comment in menu-policy into an actual
 paragraph

This comment describes how the document is supposed to be maintained. So
it seems relevant in the About chapter.

This also makes it easier to convert to DocBook, as otherwise the comment
gets lost, and it cannot be mangled as it is located in a place where a
<para> is not valid.
---
 menu-policy.sgml | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/menu-policy.sgml b/menu-policy.sgml
index c919740..2dbfee5 100644
--- a/menu-policy.sgml
+++ b/menu-policy.sgml
@@ -5,12 +5,6 @@
 ]>
 <debiandoc>
 
-  <!--
-  The debian-policy mailing list has taken responsibility for the
-  contents of this document, with the package maintainers responsible
-  for packaging administrivia only.
-  -->
-
   <book>
     <titlepag>
       <title>The Debian Menu sub-policy</title>
@@ -95,6 +89,13 @@
 	  </item>
 	</enumlist>
       </p>
+
+      <p>
+        The <url id="mailto:debian-policy@lists.debian.org"
+        name="debian-policy mailing list"> has taken responsibility for
+        the contents of this document, with the <em>Menu</em> package
+        maintainers responsible for packaging administrivia only.
+      </p>
     </chapt>
 
     <chapt>
-- 
2.11.0

From ab69ba27d5767c93fa4c2ad5cf18008841810de4 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Sun, 15 Jan 2017 20:21:10 +0100
Subject: [PATCH 6/7] Replace SGML comment with an actual reference to the
 policy process

This gets rid of a comment, so we do not have to bother with restoring
it when converting to DocBook.
---
 policy.sgml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/policy.sgml b/policy.sgml
index fb1f30f..0fb3067 100644
--- a/policy.sgml
+++ b/policy.sgml
@@ -222,7 +222,8 @@
 	  id="mailto:debian-policy@lists.debian.org">. Proposals
           are discussed there and inserted into policy after a certain
           consensus is established.
-          <!-- insert shameless policy-process plug here eventually -->
+          The current policy process is described in the <url name="Process"
+          id="Process.md"> document.
           The actual editing is done by a group of maintainers that have
           no editorial powers. These are the current maintainers:
 
-- 
2.11.0

From 3dc5a149a2edf5fd5600c8cf8b443c6587d5f1b8 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Sun, 15 Jan 2017 20:29:36 +0100
Subject: [PATCH 7/7] Remove outdated SGML comments

These have not been true for a long time, so just get rid of them.
This also makes the conversion to DocBook easier.
---
 policy.sgml | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/policy.sgml b/policy.sgml
index 0fb3067..1b86b42 100644
--- a/policy.sgml
+++ b/policy.sgml
@@ -2375,7 +2375,6 @@ endif
 	</sect1>
       </sect>
 
-<!-- FIXME: section pkg-srcsubstvars is the same as substvars -->
       <sect id="substvars">
 	<heading>Variable substitutions: <file>debian/substvars</file></heading>
 
@@ -4488,12 +4487,6 @@ fi
 		It is an error for a package to contain files which
 		are on the system in another package, unless
 		<tt>Replaces</tt> is used (see <ref id="replaces">).
-		<!--
-		The following paragraph is not currently the case:
-		Currently the <tt>- - force-overwrite</tt> flag is
-		enabled, downgrading it to a warning, but this may not
-		always be the case.
-		-->
 	      </p>
 
 	      <p>
-- 
2.11.0

