
Bug#1051748: RFP: pdf2htmlex -- convert PDF to HTML without losing text or format



Hi,

Tue 12 Sep 2023 @ 13:01 Johannes Schauer Marin Rodrigues <josch@debian.org>:

> Hi,
>
> On Tue, 12 Sep 2023 10:57:57 +0500 Lev Lamberov <dogsleg@debian.org> wrote:
>> Package: wnpp
>> Severity: wishlist
>> 
>> * Package name    : pdf2htmlex
>>   Version         : 0.18.8rc1
>>   Upstream Author : Lu Wang <coolwanglu@gmail.com> and other contributors
>> * URL or Web page : https://github.com/pdf2htmlEX/pdf2htmlEX
>> * License         : GPL-3+
>>   Description     : convert PDF to HTML without losing text or format
>> 
>
> you are aware that pdf2htmlex used to be part of Debian? It is still in
> old-old-stable. It was removed with this bug:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921471
>
> Are you sure it is wise to include that package into Debian again? The issue
> tracker is very low on activity:
>
> https://github.com/pdf2htmlEX/pdf2htmlEX/issues
>
> Thanks!

Well, upstream is indeed not very active (the latest commit was on 13
March this year). I admit that it looks more like abandonware, but
perhaps someone™ could step forward and take care of it (I personally
lack the relevant competence). Recently I had to convert LaTeX source
(XeTeX, in fact) to HTML, and the best result I got was with
LaTeX->PDF->HTML, where the last conversion was done with
pdf2htmlex. It produced an HTML document that looks exactly like the PDF
produced by LaTeX. So I thought that this tool could be of use to
someone else.
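
For reference, the LaTeX->PDF->HTML pipeline mentioned above can be
sketched roughly as follows. This is only an illustration under some
assumptions: xelatex and pdf2htmlEX are both on PATH, the document
builds in a single xelatex pass, and the helper names build_commands
and convert as well as the file name paper.tex are hypothetical, not
anything taken from the actual conversion.

```python
import subprocess
from pathlib import Path


def build_commands(tex_file: str) -> list[list[str]]:
    """Return the two command lines for the LaTeX -> PDF -> HTML pipeline."""
    tex = Path(tex_file)
    pdf = tex.with_suffix(".pdf")
    html = tex.with_suffix(".html")
    return [
        # Step 1: XeTeX source to PDF (xelatex writes <name>.pdf
        # into the directory it is run from).
        ["xelatex", "-interaction=nonstopmode", tex.name],
        # Step 2: PDF to HTML; pdf2htmlEX takes the input PDF
        # followed by an optional output file name.
        ["pdf2htmlEX", pdf.name, html.name],
    ]


def convert(tex_file: str) -> None:
    """Run both steps in the directory that holds the .tex file."""
    workdir = Path(tex_file).resolve().parent
    for cmd in build_commands(tex_file):
        # check=True raises CalledProcessError if a step fails.
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling convert("paper.tex") would then leave paper.pdf and paper.html
next to the source, provided both tools succeed. A document with
cross-references or a bibliography would likely need more than one
xelatex pass, which this sketch does not handle.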

Cheers!
Lev

