Bug#1051748: RFP: pdf2htmlex -- convert PDF to HTML without losing text or format
Hi,
Tue 12 Sep 2023 @ 13:01 Johannes Schauer Marin Rodrigues <josch@debian.org>:
> Hi,
>
> On Tue, 12 Sep 2023 10:57:57 +0500 Lev Lamberov <dogsleg@debian.org> wrote:
>> Package: wnpp
>> Severity: wishlist
>>
>> * Package name : pdf2htmlex
>> Version : 0.18.8rc1
>> Upstream Author : Lu Wang <coolwanglu@gmail.com> and other contributors
>> * URL or Web page : https://github.com/pdf2htmlEX/pdf2htmlEX
>> * License : GPL-3+
>> Description : convert PDF to HTML without losing text or format
>>
>
> you are aware that pdf2htmlex used to be part of Debian? It is still in
> old-old-stable. It was removed with this bug:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921471
>
> Are you sure it is wise to include that package into Debian again? The issue
> tracker is very low on activity:
>
> https://github.com/pdf2htmlEX/pdf2htmlEX/issues
>
> Thanks!
Well, upstream is indeed not very active (the latest commit was on 13
Mar this year). I admit that it looks more like abandonware, but
perhaps someone™ could step forward and care for it (I personally lack
the relevant competence). Recently I had to convert LaTeX source (XeTeX,
in fact) to HTML, and the best result I got was with
LaTeX->PDF->HTML, where the last conversion was done with
pdf2htmlex. It produced an HTML document that looks exactly like the PDF
produced by LaTeX. So I thought that this tool could be of use to someone
else.
Cheers!
Lev