
Re: Convert HTML to PDF from CLI?



On Mon, May 11, 2009 at 05:30, Dotan Cohen <dotancohen@gmail.com> wrote:
> I need to convert an HTML document to PDF from the CLI. Currently, I
> am using a Firefox extension to do this:
> http://torisugari.googlepages.com/commandlineprint2
>
> However, this has many drawbacks and I would like to remove the
> dependency on Firefox. These are other solutions that I have looked
> into, and their problems:
>
> 1)  CUPS
> Prints the HTML, not the formatted output. It treats the whole document as text.

That is expected: HTML is plain text, and CUPS has no HTML parser.

> 2) html2ps
> Does not work with UTF-8 text!

That is just pathetic in this day and age.
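
One possible workaround, as a minimal sketch: replace every non-ASCII
character with a numeric character reference before handing the file to
an ASCII-only tool. The function name escape_non_ascii is made up for
illustration, and whether html2ps actually renders the resulting
references correctly is an assumption I have not verified.

```python
def escape_non_ascii(html_text):
    """Return html_text with every non-ASCII character replaced by a
    numeric character reference such as &#233;, leaving ASCII untouched."""
    # "xmlcharrefreplace" is a standard Python codec error handler that
    # substitutes &#NNNN; for each character the target encoding lacks.
    return html_text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def convert_file(src_path, dst_path):
    """Read a UTF-8 HTML file and write a pure-ASCII copy of it."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    with open(dst_path, "w", encoding="ascii") as dst:
        dst.write(escape_non_ascii(text))
```

The output is still valid HTML, so the same file can be fed to any of the
converters listed in this thread.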

> 3) Open Office
> Requires either a macro or a wrapper script, both of which are
> problematic and change each new OOo version (which I update often).

Definitely a problem.

> 4) convert (from imagemagick)
> Does not support UTF-8!

Sad.

> 5) gnome-web-print
> Slight formatting problems with HTML tables. However, it is my
> runner-up so far as it is the only other solution that works at all.
>
>
>
> Any other ideas? Is there a konqueror- or KDE way to do this? Am I
> missing something obvious? Thanks!

a2ps can use lynx, but I doubt you want lynx-like output.

There are some headless "browsers" out there, but most seem to be
aimed at testing rather than output, or they output images, not PS/PDF.

There is some discussion here:
http://www.holovaty.com/writing/headless-html-rendering-engine/

Sounds like someone needs to write a CLI app that makes use of
a headless WebKit.
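
If such a tool is available, the wrapper around it can be very thin. Here
is a minimal sketch, assuming wkhtmltopdf (a WebKit-based command-line
converter) is installed and on the PATH. The helper names build_command
and html_to_pdf are made up for illustration.

```python
import subprocess


def build_command(html_path, pdf_path, tool="wkhtmltopdf"):
    """Build the argument list for the converter.

    Assumption: the tool takes the input and output paths as two
    positional arguments, as wkhtmltopdf does.
    """
    return [tool, html_path, pdf_path]


def html_to_pdf(html_path, pdf_path, tool="wkhtmltopdf"):
    """Run the converter; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(html_path, pdf_path, tool), check=True)
```

Calling html_to_pdf("page.html", "page.pdf") from a script then gives a
Firefox-free conversion, provided the renderer handles the page's tables
and UTF-8 text acceptably.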


Cheers,
Kelly Clowers

