
remove chm2pdf from lenny



[this is a copy of #502702, because I misspelled X-Debbugs-CC]

Hi,

I've been looking at chm2pdf's RC bug.
There are so many more problems that releasing it with lenny seems totally out of the
question for quality reasons.
Some items in addition to the tempfile RC bug (only the ones that turned up while trying
to get it to convert some chm to pdf; I have not specifically looked for anything):
- the calls to subprograms using os.system are not using proper
  escaping; this is a security hazard (Raphael's patch proposed to fix the
  rm -f, but the rest is just as bad),
- the code loses perfectly good data in a way that cannot be turned off (note that the
  typical chm files I found that related to free software were API docs, which lose
  a great deal without links to the right place on the page in per-module lists of
  functions):
  # Replace links of the form "somefile.html#894" with "somefile0206.html"
  # The following will match anchors like '<a href="temp0206.html#894"' and will store the
  # 'temp0206.html' in backreference 1.
  # The replace string will then replace it with '<a href="temp0206.html"', i.e. it will
  # take away the '#894' part.
  # This is because the numbers after the '#' are often wrong or non-existent. It is
  # better to link to an existing chapter than to a non-existent part of an existing
  # chapter.
- The implementation is unacceptably inefficient: the convert_to_pdf function uses
  the following (match_strings and replace_garbled_strings are of length = number of
  pages), and this is in a loop over all pages, so the number of re.sub calls grows
  quadratically with the number of pages, with a linear list.index lookup on top of
  each. This is completely bogus (in terms of what they lengthily explain they are
  trying to achieve) and unacceptably inefficient; this can readily be implemented in
  linear time without effort.

     # Substitutions in 1st pass: we replace the original filenames with their
     # corresponding "garbled" equivalents.
        for match_string in  match_strings:
            replace_string = replace_garbled_strings[match_strings.index(match_string)]
            page = re.sub(match_string, replace_string, page)
     # Substitutuions in the 2nd pass: we replace the garbled filenames with the correct
     # ones.
        for match_string in  replace_garbled_strings:
            replace_string = replace_strings[replace_garbled_strings.index(match_string)]
            page = re.sub(match_string, replace_string, page)
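
For illustration, here is a minimal sketch of how the two passes above could be collapsed
into a single scan per page. It is not taken from chm2pdf: build_replacer and mapping are
hypothetical names, and it assumes the file names are literal strings rather than regular
expressions. Because every match is replaced exactly once, a name that maps onto another
name in the table is not replaced a second time, so the "garbled" intermediate names are
not needed.

```python
import re

def build_replacer(mapping):
    """Return a function that replaces every key of `mapping` with its value.

    `mapping` is a hypothetical dict {old_filename: new_filename}. The pattern
    is compiled once and can then be reused for every page.
    """
    if not mapping:
        return lambda page: page
    # Longest names first, so a name that is a prefix of another cannot shadow it.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    # Each match is looked up in the dict; replaced text is never rescanned.
    return lambda page: pattern.sub(lambda m: mapping[m.group(0)], page)

# Chained names: a.html becomes b.html, and b.html becomes c.html.
replace = build_replacer({"a.html": "b.html", "b.html": "c.html"})
converted = replace('<a href="a.html">A</a> <a href="b.html">B</a>')
```

The replacer would be built once before the loop over all pages and then called once per
page, instead of issuing one re.sub call per file name per page.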

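Regarding the os.system calls, a minimal sketch of the usual alternative is to pass an
explicit argument list to subprocess, so that no shell ever parses the file names. The
helper name run_command and the example file name are hypothetical; the commands chm2pdf
actually runs are not reproduced here, and the Python interpreter itself stands in for the
external program.

```python
import subprocess
import sys

def run_command(argv):
    """Run a program from an explicit argument list and return its stdout.

    No shell is involved, so spaces, quotes, and ';' in arguments are passed
    through to the program unchanged.
    """
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout

# A file name that would be dangerous if pasted into an os.system() string.
hostile_name = "doc; echo injected.chm"

# Stand-in for an external tool: a child Python process that prints its argument.
echo_argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name]
output = run_command(echo_argv)
```

For plain file removal, os.remove avoids spawning rm at all; where a shell command string
really is unavoidable, shlex.quote can be applied to each argument.
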
chm2pdf has never been in a Debian release, and it should not be in one before it gets better.
Please remove it from lenny.

Kind regards

T.
-- 
Thomas Viehmann, http://thomas.viehmann.net/

