
Re: Why do we have a default activated man-db.timer?



Tom Bachreier, on 2019-02-11:
> With the migration of man-db 2.8.5-1 into testing about a month ago I
> get bothered once a day with a lot of useless lines in syslog:
> > Feb 10 00:00:51 osprey mandb[3543]: Purging old database entries in /usr/share/man...
> > Feb 10 00:00:51 osprey mandb[3543]: Processing manual pages under /usr/share/man...
> > [...]
[...]
> Thankfully Francois filed a bug earlier:
> <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=920628>
>
> which got fixed upstream:
> <https://git.savannah.gnu.org/cgit/man-db.git/commit/?id=a4206c27060357cc78219a54349624e0d0675aff>
>
> So the messages will go away for me in a couple of days.
> Problem solved. :-)
>
> BUT wait...
> I was wondering why do we have man-db.timer in the first place?

Good Day Tom,

That is an interesting question, with a story behind it that
deserves to be told.  The answer is not obvious and is easy to
miss, mostly because of the computing power we can afford these
days.

First, let's have a look at the patch:

	https://git.savannah.gnu.org/cgit/man-db.git/commit/?id=a4206c27060357cc78219a54349624e0d0675aff

The patch modifies the service unit of mandb so that the
database of manual pages is regenerated in quiet mode.  On my
Sid machine, this file is
/usr/lib/systemd/system/man-db.service and contains, among other
things, these lines:

	Type=oneshot
	# Recover from deletion, per FHS.
	ExecStart=+/usr/bin/install -d -o man -g man -m 0755 /var/cache/man
	# Expunge old catman pages which have not been read in a week.
	ExecStart=/usr/bin/find /var/cache/man -type f -name *.gz -atime +6 -delete
	# Regenerate man database.
	ExecStart=/usr/bin/mandb --quiet

So the service does more than re-index with mandb: it also
performs some cleanup, given the `find [...] -delete`.
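
To get a feeling for what that `find` line does, here is a
small sketch that runs the same predicate on a scratch
directory rather than on the real /var/cache/man.  The file
names stale.1.gz and fresh.1.gz are made up for the
illustration, and `touch -d '10 days ago'` assumes GNU
coreutils.

```shell
#!/bin/sh
# Sketch only: exercise the "expunge" step on a throwaway directory.
cache=$(mktemp -d)
mkdir -p "$cache/cat1"
: > "$cache/cat1/stale.1.gz"
: > "$cache/cat1/fresh.1.gz"

# Pretend stale.1.gz was last read ten days ago (GNU touch syntax).
touch -a -d '10 days ago' "$cache/cat1/stale.1.gz"

# Same predicate as in the unit file.  The pattern is quoted here
# because a shell, unlike systemd, would try to expand the glob.
find "$cache" -type f -name '*.gz' -atime +6 -delete

# List what survived the cleanup, then remove the scratch directory.
remaining=$(ls "$cache/cat1")
echo "$remaining"
rm -rf "$cache"
```

Only the file whose access time is older than the `-atime +6`
threshold is removed; a page read within the last week stays in
the cache.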

Manual pages rely on several mechanisms to speed up fetching and
showing documentation.  The first is this indexing, done by
mandb, which gathers the multiple locations of manual pages,
such as /usr/share/man for the system of course, but also
locations for custom programs, such as /usr/local/man, and
perhaps you might wish to add some third-party ones too, with
/opt/man, $HOME/man or whatever.  Anyway, these indexes speed up
fetching manual page meta-information such as:

- where to find the original content of the documentation when
  running `man`:

	$ man man

- fetch the short description of commands when running `whatis`:

	$ whatis man
	man (1)              - an interface to the on-line reference manuals
	man (7)              - macros to format man pages

- search for commands by using a few key words when running
  `apropos`:

	$ apropos --and create or update manual page
	catman (8)           - create or update the pre-formatted manual pages
	mandb (8)            - create or update the manual page index caches

The other mechanism consists of caching the formatted output of
manual pages.  The original file containing the documentation,
inside /usr/share/man, is written in roff markup, which is
processed by groff, or by mandoc on BSD.  What the pager shows
when running `man` (the pager being `more`, `less`, or sometimes
`most`) is not the plain text source file itself, but an
intermediate representation mostly composed of text and terminal
escape sequences (chosen depending on the value of the TERM
variable).  This intermediate representation is cached as `cat`
pages, in a directory tree below /var/cache/man that mimics the
tree below /usr/share/man.  The only differences between the two
are that the "man*/" directories are named "cat*/" instead, and
that the files hold formatted pages instead of roff sources.
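
The correspondence between the two trees can be sketched with a
small shell function.  Mind that `cat_path_for` is a
hypothetical helper written for this mail, not something shipped
by man-db, and that it hard-codes the /usr/share/man to
/var/cache/man pairing described above.

```shell
#!/bin/sh
# Hypothetical helper, not part of man-db: derive the cat page path
# that mirrors a source page below /usr/share/man.
cat_path_for() {
    # $1: source page, e.g. /usr/share/man/man1/man.1.gz
    src_dir=$(dirname "$1")            # /usr/share/man/man1
    page=$(basename "$1")              # man.1.gz
    section=$(basename "$src_dir")     # man1
    parent=$(dirname "$src_dir")       # /usr/share/man
    # Keep any sub-tree between the root and the section directory,
    # and rename "manN" to "catN".
    printf '%s\n' "/var/cache/man${parent#/usr/share/man}/cat${section#man}/$page"
}

cat_path_for /usr/share/man/man1/man.1.gz
cat_path_for /usr/share/man/man8/mandb.8.gz
```

Whether a cat page really exists at the resulting location
depends on it having been generated and not yet expunged.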

You can see for yourself how `man` searches your documentation
by passing the debug option -d and browsing the output in
debug.out at your convenience:

	$ man -d man 2> debug.out

I discovered, for instance, that I never use this cache because
I cap my terminal width to 65 characters using the variable
MANWIDTH, and the debug output showed me:

	Terminal width 65 not within cat page range [80, 80]

Since lots of people no longer use 80-column terminals, mostly
because of "maximize window", I gather this feature is not of
much use lately, at least not with Debian's default values.
The range can be configured in /etc/manpath.config if you are
interested in such manipulations.
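
As a rough illustration of the decision behind that debug line,
here is a sketch of the range test.  The real check happens
inside `man` itself; `cat_page_usable` is a made-up function,
and the two variables are named after the MINCATWIDTH and
MAXCATWIDTH directives which, if I read manpath(5) correctly,
set the range in /etc/manpath.config.

```shell
#!/bin/sh
# Sketch only: mimic the "within cat page range" test.
# Values match the [80, 80] range reported in the debug output.
MINCATWIDTH=80
MAXCATWIDTH=80

cat_page_usable() {
    # $1: terminal width in columns (e.g. the value of MANWIDTH)
    [ "$1" -ge "$MINCATWIDTH" ] && [ "$1" -le "$MAXCATWIDTH" ]
}

for width in 65 80 120; do
    if cat_page_usable "$width"; then
        echo "$width: cat page may be used"
    else
        echo "$width: not within cat page range [$MINCATWIDTH, $MAXCATWIDTH]"
    fi
done
```

With such a narrow range, both my 65 columns and a maximized
window fall outside, so the page gets formatted afresh each
time.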

Well, I don't believe this catman mechanism is of much use
today: most manual pages are rendered in the blink of an eye.
The last time I had to /wait/ more than a second for a manual
page to be formatted, it was on a Sun Fire V440.


To summarize, the patched service clears the cache of unused
formatted man pages, which should no longer be referenced by
mandb.  Part of all this is more or less described in the manual
page man(1), particularly in sections "OVERVIEW" and "DEFAULTS",
which give succinct descriptions of the various mechanisms
involved.  See also catman(8) and mandb(8).

Kind Regards,
-- 
Étienne Mollier <etienne.mollier@mailoo.org>

I might have missed some important or obvious points, I'm well
overdue for my sleeping hours...


