RE: How to read a word 7 file?



(Ted Harding) writes:
 > On 22-Jun-98 Luiz Otavio L. Zorzella wrote:
 > > 
 > > Hi, folks.
 > > 
 > > Is there any way to read a Word 7 .doc file on my Linux box?
 > > Maintaining the "indents" and "bolds" would be a plus, but mainly I
 > > just need to read the text in it.
 > > 
 > > I use StarOffice to read docs, but it only reads up to Word 6 files
 > >:^<
 > 
 > A rough-and-ready way to do just what you're asking is to use the "strings"
 > command:
 > 
 >    strings wordfile.doc > wordfile.txt
 >

"strings" would do a good job for me, but...

 > and then edit wordfile.txt to clean it up. Raw "strings" will skip sequences of
 > fewer than 4 ASCII characters but these are unlikely to occur in a Word
 > document. This method will suppress all formatting info except end-of-line, so
 > you are likely to get long lines (= Word paragraphs). It will also fail to
 > recognise any non-US-ASCII character codes (above 127) so accented characters
 > and special symbols, etc, will be missed. But if you simply need to read the
 > text content of a Word document containing plain English text, then this method
 > works fine.

... my text is in Portuguese, and does have non-US-ASCII characters. Is
there a way to tell "strings" to accept some of them?
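
One possibility, offered only as a sketch: the "strings" shipped with GNU
binutils has an encoding option, and "-e S" makes it treat 8-bit bytes
(above 127) as printable characters instead of string breaks. The example
below assumes GNU "strings" and "iconv" are installed, and it assumes the
accented text is stored as single bytes in the Windows CP1252 code page,
which is a guess about Word 7 files and not something stated in this
thread. The file names sample.doc and sample.txt are made up for the
demonstration.

```shell
# Create a tiny stand-in for a .doc file: binary junk around the CP1252
# bytes for the Portuguese word "Informacao" with c-cedilla (octal 347)
# and a-tilde (octal 343). sample.doc is a hypothetical test file.
printf '\001\002Informa\347\343o\000\003' > sample.doc

# -e S: accept single 8-bit characters (GNU binutils strings only).
# iconv then converts the assumed CP1252 bytes to UTF-8 for reading.
strings -e S sample.doc | iconv -f CP1252 -t UTF-8 > sample.txt
```

For a real document the same pipeline would be run on wordfile.doc. If the
accented letters come out wrong, the code page guess is probably the part
to change.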

Thanks.

-- 
Luiz Otavio L. Zorzella                 Product Engineer
zorzella@conexware.com          http://www.conexware.com



