On Tue, May 11, 2004 at 01:01:16PM -0400, Matt Price wrote:
On Tue, May 11, 2004 at 11:30:11AM -0400, Ralph Katz wrote:
thanks for the clues folks. pdftohtml -- which I confess I *did*
already know about, sorry, should have said so -- won't work so well
for me, I don't think; these are scanned-in texts from the jstor
journal collection, and it's important I keep the pages in order...
as, er, someone mentioned earlier (don't have the thread in front of
me at the moment), a complex process involving gimp and pdftops seems
to be the best bet, but it's insanely labour-intensive for long
documents, so I may forego the whole project. thx all though.
Well, if you have scanned all the pages in about the same position,
and you can establish reasonably well the coordinates of the crop,
you can write a script that does all the work in one step (containing
all the inner steps).
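
Something along these lines might do as a starting point. It is only a
rough sketch, not the gimp/pdftops route discussed earlier: it assumes
poppler's pdftoppm is installed and that its crop options (-x, -y, -W,
-H) are available in your version. The crop box numbers, the resolution
and the script name are made up for illustration; you would measure the
real ones once on a sample page (in gimp, say) and reuse them.

```python
import subprocess
import sys

# Hypothetical crop box, in pixels at the resolution below. Measure it
# once on a sample page and reuse it for every page of the scan.
CROP = {"x": 100, "y": 150, "width": 1000, "height": 1500}
DPI = 150  # assumed rendering resolution


def build_crop_command(pdf_path, out_prefix, crop=CROP, dpi=DPI):
    """Return the pdftoppm argument list that renders and crops every page."""
    return [
        "pdftoppm",
        "-r", str(dpi),            # rendering resolution
        "-x", str(crop["x"]),      # left edge of the crop box
        "-y", str(crop["y"]),      # top edge of the crop box
        "-W", str(crop["width"]),  # crop box width
        "-H", str(crop["height"]), # crop box height
        "-png",                    # write PNG files
        pdf_path,
        out_prefix,
    ]


def main(argv):
    if len(argv) != 3:
        print("usage: crop_pages.py INPUT.pdf OUTPUT_PREFIX")
        return 1
    # One call handles the whole document; output files are numbered by page.
    subprocess.run(build_crop_command(argv[1], argv[2]), check=True)
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Since the output images are numbered by page, the page order should
survive, and the same crop gets applied to every page in one go.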