man-db segfault: problem found (but need help for the solution :-)
After several nights of work and nightmares, I have found, by chance,
the reason for the mysterious segfault that affects lots of people, both
in bo and hamm.
The segfault is due to a "bug" in the db1 library when it reads a
corrupted index file. The db2 library doesn't segfault, but issues a
warning and acts as if there were no index (not a solution, therefore).
I had a really bad time trying to reproduce the corruption, and even a
corrupted index sent to me by Joey Hess (thanks) didn't help much,
although it was an important step forward.
I had some input that the corruption was related to the upgrade or
installation of packages (and therefore manpages), and I was suspicious
of man's code for the database update (which is done automatically when
man finds new manpages, e.g. after an installation), but I found
nothing wrong there.
By chance, I happened to do the following (in this order):
- upgraded a package that installs manpages;
- issued an "info getext" command (which found nothing);
- tried a man command and got a segfault.
The /var/catman/index.bt file was corrupted.
(If you want to try it out, better copy the index file somewhere first.
Afterwards you can simply copy it back and man will work again.)
I studied the info sources and found that info/man.c closes stdin and
stdout before exec-ing man.
<THAT> is the problem.
You can replace the "info getext" command above with a line like
    man getex 2<&- 1<&-
to get the same corruption. You can also simply "touch" a manpath
instead of installing new manpages:
    sudo touch /usr/man/man1
    man something 2<&- 1<&-
Normally, "man something" searches through all the manpaths before
saying "No manual entry for something". When it finds manpages more
recent than the index files, it updates the indexes, appending the new
entries through the db interface.
While doing that, it prints this message on stdout:
"Updating index cache for path `/usr/man'.Wait..."
Well, the index file is being corrupted exactly by the inclusion of
that string, as a grep on the output of strings on a corrupted database
shows.
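
For anyone who wants to check an index without running man on it, here
is a rough Python equivalent of the strings-plus-grep check. It is only
a sketch: find_message and NEEDLE are names made up for this example,
and finding the text is a hint of this particular corruption, not proof
that the file is otherwise intact.

```python
# Fixed part of the status message quoted in this mail (the path varies).
NEEDLE = b"Updating index cache for path"


def find_message(path, needle=NEEDLE):
    """Return the byte offsets at which needle occurs in the file at path."""
    with open(path, "rb") as f:
        data = f.read()
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

Calling find_message("/var/catman/index.bt") returns an empty list when
the message text is not present in the file.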
Well, maybe I'm tired, or I've relaxed because of the discovery, but now
I'm not able to understand what is happening.
Why the hell does closing stdout _and_ stderr produce a redirection to a
file?
Can somebody help me to understand this?
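
One possible explanation, offered as an assumption and not checked
against the man-db or db1 sources: POSIX open() returns the
lowest-numbered free descriptor, so a process started with fd 1 closed
can get fd 1 back from the next file it opens. If that file happens to
be the index, anything later printed on stdout lands in it. A minimal
Python sketch of that mechanism on a scratch file (simulate and the
scratch file are made up for illustration; nothing here touches the
real index):

```python
import os
import tempfile

# Status message quoted in this mail; man prints it on stdout.
MESSAGE = b"Updating index cache for path `/usr/man'.Wait..."


def simulate(index_path):
    """Fork a child that closes fds 1 and 2, opens index_path, then "prints".

    Returns the descriptor number that the child's open() call received.
    """
    pid = os.fork()
    if pid == 0:  # child: stands in for man started with stdout closed
        status = 255  # generic failure code
        try:
            for fd in (1, 2):  # what `2<&- 1<&-` does in the shell
                try:
                    os.close(fd)
                except OSError:
                    pass  # was already closed
            try:
                os.fstat(0)  # make sure fd 0 is occupied
            except OSError:
                os.open(os.devnull, os.O_RDONLY)
            # Hypothetical stand-in for the db library opening the index.
            fd = os.open(index_path, os.O_RDWR)
            # Stand-in for printing the status message on stdout (fd 1).
            os.write(1, MESSAGE)
            status = fd
        finally:
            os._exit(status)  # leave without running parent cleanup code
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.bt")  # scratch file only
        with open(path, "wb") as f:
            f.write(b"\0" * 128)
        fd = simulate(path)
        with open(path, "rb") as f:
            data = f.read()
        print("open() returned fd", fd)
        print("message found in scratch index:", MESSAGE in data)
```

If this is what happens inside man, then the message is not redirected
at all: stdout and the index would simply be the same descriptor.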
| firstname.lastname@example.org email@example.com firstname.lastname@example.org
| Pluto Leader - Debian Developer & Happy Debian 1.3.1 User - vi-holic
| 6F7267F5 fingerprint 57 16 C4 ED C9 86 40 7B 1A 69 A1 66 EC FB D2 5E
> Just because Red Hat do it doesn't mean it's a good idea. [Ian J.]