automate printing of html-formatted pages?

To: debian-user@lists.debian.org
Subject: automate printing of html-formatted pages?
From: Matt Price <matt.price@utoronto.ca>
Date: Fri, 02 Sep 2005 21:51:59 -0400
Message-id: <431901BF.4090900@utoronto.ca>

Hi folks,

My partner needs to print out copies of all the content in her
mid-sized, statically-generated website (I know this is a stupid idea,
but it's for her tenure file and there are lots and lots of stupid
elements in this process).  This seems like something one ought to be
able to do automatically, e.g. with:

wget -m -k http://some.website.com/

and then:

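#! /bin/bash
# Convert every mirrored HTML file to PostScript.
find /path/to/top/level -type f -iname '*.html' | while read -r file; do
    html2ps -gn "$file" > "$file".ps
done
# Print the generated PostScript files (not the HTML sources).
find /path/to/top/level -type f -iname '*.ps' | while read -r psfile; do
    lpr "$psfile"
done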
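
For what it's worth, a more defensive sketch of the same loop might look
like the following. It assumes bash and a find that supports -print0 and
-iname (GNU and BSD find both do). The names print_site, CONVERT and
PRINT are made up for illustration and are not part of html2ps or lpr;
they only make it possible to swap in another converter or do a dry run.
It does nothing about how html2ps lays out the pages.

```shell
#!/bin/bash
# Sketch only: convert each mirrored .html file to PostScript and print
# it right away. CONVERT and PRINT are hypothetical override variables
# that default to the commands used in the original loop.
CONVERT=${CONVERT:-html2ps -gn}
PRINT=${PRINT:-lpr}

print_site() {
    local top=$1 file
    # -print0 with read -d '' keeps file names containing spaces or
    # newlines intact.
    find "$top" -type f -iname '*.html' -print0 |
    while IFS= read -r -d '' file; do
        # $CONVERT and $PRINT are left unquoted on purpose so that
        # their options are split into separate words.
        if $CONVERT "$file" > "$file.ps"; then
            $PRINT "$file.ps"
        else
            echo "conversion failed: $file" >&2
        fi
    done
}

# Run only when a top-level directory is given on the command line.
if [ "$#" -gt 0 ]; then
    print_site "$1"
fi
```

Setting PRINT="echo would print" before calling print_site gives a dry
run that lists the PostScript files instead of sending them to the
printer.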

Unfortunately, this doesn't work very well -- among other things,
html2ps does some very strange things with the layout of the pages,
apparently trying to cram all the text even in very long pages into a
single 8.5x11 sheet of paper -- I've posted an example at
http://www.racesci.org/test.ps (the original html file is at
http://www.racesci.org/bibliographies/current_scholarship/sigerist.html ).

Presumably this has something to do with the rendering bug mentioned in
the html2ps man page:


       Rendering HTML tables well is a non-trivial task. For "real"
       tables, that is representation of tabular data, html2ps usually
       generates reasonably good output. When tables are used for
       layout purposes, the result varies from good to useless. This
       is because a table cell is never broken across pages. So if a
       table contains a cell with a lot of content, the entire table
       may have to be scaled down in size in order to make this cell
       fit on a single page. Sometimes this may even result in
       unreadable output.

OK, I can see this is difficult to do. But is there another
command-line solution to my problem?


Thanks as always for your suggestions.

Matt
