
Re: PS to HTML?



On Tue, Jan 23, 2001 at 11:07:21PM -0800, Eric G . Miller wrote:
> Not really easy to do.  PostScript has a lot of stuff that simply won't
> translate to html easily or at all.

I'm not really interested in the formatting, just something that will extract
the text of the PS doc and insert some reasonable markup into it so that I
don't have to do it manually.  (I'd still expect to need to clean it up, but
even a zeroth approximation would be nice.)
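
To make "zeroth approximation" concrete, here is a rough sketch of the sort of
thing I mean. It assumes the text has already been pulled out of the PS file by
some other tool and that paragraphs are separated by blank lines; the function
name text_to_html and the default title are made up for illustration, not
taken from any existing program.

```python
import html
import re


def text_to_html(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs of plain text in minimal HTML.

    Hypothetical helper: assumes `text` is already-extracted plain text.
    """
    # Treat one or more blank lines as a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    body = []
    for para in paragraphs:
        # Join the hard-wrapped lines of a paragraph into a single line.
        flat = " ".join(para.split())
        if flat:
            # Escape &, <, > and quotes so the text cannot break the markup.
            body.append("<p>" + html.escape(flat) + "</p>")
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + "\n".join(body) + "\n</body></html>\n"
    )
```

It makes no attempt to detect headings, lists, or tables, so the result would
still need cleaning up by hand.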

> I've been to at least one site that has documents on-
> line as a series of images -- one per page -- with links between pages,
> the top and bottom.  Not beautiful, but actually functional and legible.

I've seen that sort of thing too, and it drives me nuts.  I'm a believer in
the theory that textual information should be available online as _text_, not
just as images.  (A picture of a word is not worth a thousand words, but it's
almost as big...)

Anyhow, a couple of other people have pointed me at tools that may be
appropriate.  I'll check them out and report my results.

BTW, anyone know what's up with pstotext?  I ran a PS doc through it last
night and there were a lot of extra spa ces in  the outpu t, including many
in mid-word.  Is this preventable?
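
Failing a real fix, one stopgap would be to clean up the output afterwards.
The sketch below is only a heuristic and rests on assumptions: it needs a
caller-supplied set of known words (lowercase), and the name rejoin_fragments
is hypothetical.  It glues two neighbouring tokens together when the joined
form is a known word and at least one of the pieces is not.

```python
def rejoin_fragments(line, known_words):
    """Heuristically merge word fragments split by a stray space.

    Hypothetical helper: `known_words` is a set of lowercase words
    supplied by the caller (e.g. loaded from a word list).
    """
    tokens = line.split()
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            left, right = tokens[i], tokens[i + 1]
            joined = left + right
            # Merge only if the joined form is a known word and at least
            # one half is not, so that valid word pairs are left alone.
            if joined.lower() in known_words and (
                left.lower() not in known_words
                or right.lower() not in known_words
            ):
                out.append(joined)
                i += 2
                continue
        out.append(tokens[i])
        i += 1
    return " ".join(out)
```

It ignores punctuation attached to a fragment and collapses runs of spaces, so
it would miss some cases and is no substitute for fixing the spacing at the
source.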

-- 
SGI products are used to create the 'Bugs' that entertain us in theatres
and at home. - SGI job posting
Geek Code 3.1:  GCS d? s+: a- C++ UL++$ P++>+++ L+++>++++ E- W--(++) N+ o+
!K w---$ O M- V? PS+ PE Y+ PGP t 5++ X+ R++ tv b+ DI++++ D G e* h+ r y+


