Re: A Bit of a Strange Situation
PDF has accessibility issues for screen reader users, and RiverWind and
I are both screen reader users. The best we can attempt is a text
extraction from PDF files if we're going to read what's in them. If
what was left in the file was a scanned image, maybe that can be scanned
on Windows, but I don't know that a parallel capability exists on Linux
yet.
Also, whenever text extraction gets done on PDF files with command line
tools on Linux, there are spelling mistakes in the output. Those of us
who can't see the screen would be really happy if Adobe had either never
come into existence or never invented the PDF format. Also,
knowledgeable sighted technical people I talk with hate Adobe and PDF
with a passion, and they can't all be wrong.
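
For what it's worth, here is a rough sketch of the kind of command line
extraction I mean. It assumes pdftotext (from poppler-utils or xpdf) is
installed and that the PDF was saved as linuxcookbook-1.2.pdf; both are
assumptions, as is the output file name. The fix_ligatures helper is
hypothetical and rests on a guess that some of the "spelling mistakes"
come from ligature characters such as the single-glyph "fi"; it will
not cure every problem.

```shell
#!/bin/sh
# Sketch only: both file names below are assumptions.
pdf="linuxcookbook-1.2.pdf"
txt="linuxcookbook-1.2.txt"

# Hypothetical helper: replace common ligature characters with the
# plain letters they stand for. Reads stdin, writes stdout.
fix_ligatures() {
    sed -e 's/ﬃ/ffi/g' -e 's/ﬄ/ffl/g' \
        -e 's/ﬀ/ff/g' -e 's/ﬁ/fi/g' -e 's/ﬂ/fl/g'
}

# Only attempt the extraction if the tool and the file are both present.
if command -v pdftotext >/dev/null 2>&1 && [ -f "$pdf" ]; then
    # -layout tries to keep the page layout; "-" sends text to stdout.
    pdftotext -layout "$pdf" - | fix_ligatures > "$txt"
fi
```

If the PDF holds only scanned page images, pdftotext will produce little
or no text, and this approach won't help.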
On Thu, 25 Aug 2011, Curt wrote:
> On 2011-08-24, RiverWind <riverwind@shellworld.net> wrote:
> >
> > I have downloaded the linux cookbook, which consists of over five-
> > hundred html files. I am wanting to concatenate them all into one
> > big neat file, with all of the smaller files in perfect order. Now
> > I know that "cat" can do this, but the file naming protocol is a
> > bit strange. The names of the smaller files have to accommodate
> > both "parts" and "sections", which makes for an interesting naming
> > format. For instance, the first is named "cookbook1.html#SEC1." I
> > tried the following command.
>
> How 'bout just downloading one nice big neat pdf file?
>
> http://www.usinglinux.org/docu/guides/linuxcookbook-1.2.pdf
>
> You could convert that to html with 'pdftohtml'. Whether the resulting
> document would meet your rigorous standards, I dunno.
>
> If not, if you find the names of the html files in your possession
> inconvenient, why not rename them?
>
>
>
Jude <jdashiel@shellworld.net>
"I love the Pope, I love seeing him in his Pope-Mobile, his three feet
of bullet proof plexi-glass. That's faith in action folks! You know he's
got God on his side."
~ Bill Hicks