8-bit safe text utils?
I'm trying to process some 8-bit text on my Debian system and it's
giving me fits. Clearly some of the programs I'm piping things
through aren't 8-bit aware. Can someone point me to a good listing of
these and/or to a discussion of how to work around the limitations of
the system?
Here's an example of what I'm doing. The input is an official Dutch
word list called "woor-den.max" and the output is to be a compressed
dictionary to be included with a free Scrabble clone I'm developing
for the PalmOS platform.
The words include a character (octal 0267) that indicates hyphenation,
and I want to strip it out. If I type the following in the bash shell
(running in emacs shell mode or in xterm; it doesn't matter)
# tr -d "\267" < woor-den.max
tr does nothing. But if I save the same command as a bash shell
script and execute it I get the desired result.
grep behaves the same way: it fails interactively but works from a script.
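For reference, here is a small self-contained version of the tr step
that doesn't need the word list. The sample words and the function name
strip_hyphenation are made up for illustration, and forcing LC_ALL=C is
only my guess at what might differ between the interactive shell and the
script; I haven't confirmed that the locale is the cause.

```shell
#!/bin/sh
# Sketch: delete the hyphenation byte (octal 0267) from standard input.
# LC_ALL=C is an assumption: it asks tr to treat input as plain bytes.
strip_hyphenation() {
    LC_ALL=C tr -d '\267'
}

# Hypothetical sample data standing in for woor-den.max.
printf 'woor\267den\nlet\267ter\n' | strip_hyphenation
```

With the real file this would be run as
strip_hyphenation < woor-den.max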
This can't be an unfamiliar problem for those of you across the Atlantic.
What's the best coping strategy?
Thanks!
--Eric House
/******************************************************************************
* Sun .signature deleted: this isn't a Sun project!
******************************************************************************/