Re: How to convert html to PDF?



Quoting Jonas Smedegaard (2017-01-30 22:35:13)
> Quoting Jonas Smedegaard (2017-01-30 18:28:11)
>> Quoting Shrinivasan T (2017-01-30 17:52:29)
>>> I am looking for a solution to convert HTML to PDF with custom Tamil 
>>> language fonts and custom paper size.
> [...]
>> You might also try pandoc (unlikely to work without fine-tuning, but 
>> if it works then you can do powerful things like scraping a web page 
>> and applying a LaTeX template to produce professional-grade output, 
>> as I did with http://source.jones.dk/eut.git/ to produce 
>> http://eut.biks.dk/ - a 60+ page research study edited on a wiki 
>> and finalized as PDF books optimized for print and "ebook-style" use).
>
> This - using XeLaTeX and XeTeX internally - seems to work:
>
>   pandoc --standalone --latex-engine xelatex -V lang='' -V papersize=a5 -V mainfont="Uni Ila.Sundaram-10" -V margin-left=5mm -V margin-right=5mm -V margin-top=10mm -V margin-bottom=15mm --output manaosai-xetex-ebook.pdf https://ia800203.us.archive.org/31/items/ManaosaiShortStories/Manaosai-short-stories.html

The pandoc approach above is my favorite, and I would appreciate 
feedback from others on the quality of the Tamil font rendering (e.g. 
kerning, and whether my suspicion about unsupported characters is 
correct).
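
For anyone who wants to tweak the options, here is a rough sketch of 
the same pandoc call with one option per line. The variable names FONT, 
URL, OUT and RUN are only illustrative, not part of pandoc. The margin 
variables are spelled as in pandoc's documented LaTeX template variables 
(margin-left, margin-right, margin-top, margin-bottom). By default the 
script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: the pandoc options from this thread, one per line.
# FONT, URL, OUT and RUN are illustrative names chosen for this example.
FONT="Uni Ila.Sundaram-10"
URL="https://ia800203.us.archive.org/31/items/ManaosaiShortStories/Manaosai-short-stories.html"
OUT="manaosai-xetex-ebook.pdf"

# Collect the arguments in "$@" so that the font name, which contains
# a space, stays a single argument.
# Note: pandoc 2.0 and later call --latex-engine --pdf-engine instead.
set -- \
  --standalone \
  --latex-engine xelatex \
  -V lang='' \
  -V papersize=a5 \
  -V mainfont="$FONT" \
  -V margin-left=5mm \
  -V margin-right=5mm \
  -V margin-top=10mm \
  -V margin-bottom=15mm \
  --output "$OUT" \
  "$URL"

# Print the command for review; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = 1 ]; then
  pandoc "$@"
else
  printf '%s\n' pandoc "$@"
fi
```

Save it under any name you like and run it with RUN=1 in the 
environment to produce the PDF. To see which installed fonts declare 
Tamil coverage before choosing a mainfont, fontconfig's 
"fc-list :lang=ta family" should list them.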

This seems to work too:

cutycapt --user-style-string="body{font-family:'Uni Ila.Sundaram-10'}" --url=https://ia800203.us.archive.org/31/items/ManaosaiShortStories/Manaosai-short-stories --out=cutycapt.pdf

...but, like wkhtmltopdf, it is limited - e.g. I cannot set the page size.

python-pisa is currently broken: http://bugs.debian.org/852363 - and 
likely won't handle Tamil anyway: http://bugs.debian.org/610390


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
