
Bug#803342: dhelp: Weekly cron job terminates, doesn't create 'documents.index'



Hi,

I am also affected by this bug. Here is the e-mail I receive every week (the complete e-mail is 16 MB!):

/etc/cron.weekly/dhelp:
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::erase: __pos (which is 18446744073709551615) > this->size() (which is 0)
Dhelp::IndexerError: Broken pipe indexing /usr/share/doc/muscle/muscle.html
, /usr/share/doc/muscle/muscle.html
[...]
, /usr/share/doc/libatlas-doc/atlas_devel.pdf.gz
, using /usr/bin/index++ --config-file /usr/share/dhelp/config/swish++.conf --index-file /var/lib/dhelp/documents.index --follow-links - (/usr/lib/ruby/vendor_ruby/dhelp.rb:616:in `rescue in index'
/usr/lib/ruby/vendor_ruby/dhelp.rb:609:in `index'
/usr/sbin/dhelp_parse:171:in `do_deferred_indexing'
/usr/sbin/dhelp_parse:205:in `main'
/usr/sbin/dhelp_parse:221:in `<main>')

A quick way to reproduce this bug is to run this command:

# echo /usr/share/doc/libatlas-doc/atlas_devel.pdf.gz | /usr/bin/index++ --config-file /usr/share/dhelp/config/swish++.conf --index-file index  -
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::erase: __pos (which is 18446744073709551615) > this->size() (which is 0)
Aborted (core dumped)

This should make it easier to debug for someone who is familiar with swish++.

My guess is that files with a double extension (.pdf.gz) are not handled correctly, so index++ gets confused by binary data and crashes.

This can be seen by running:

# echo /usr/share/doc/libatlas-doc/atlas_devel.pdf.gz | strace -f /usr/bin/index++ --config-file /usr/share/dhelp/config/swish++.conf --index-file index  -

and in the strace log, I see:

read(0, "/usr/share/doc/libatlas-doc/atla"..., 4096) = 47
stat("/usr/share/doc/libatlas-doc/atlas_devel.pdf.gz", {st_mode=S_IFREG|0644, st_size=253989, ...}) = 0
lstat("/usr/share/doc/libatlas-doc/atlas_devel.pdf.gz", {st_mode=S_IFREG|0644, st_size=253989, ...}) = 0
unlink("/var/lib/dhelp/tmp/atlas_devel.pdf") = -1 ENOENT (No such file or directory)

The unlink() call fails with ENOENT, which suggests that /var/lib/dhelp/tmp/atlas_devel.pdf is never created.

-- 
Laurent.

