Re: European chars to ascii
On Fri, Aug 19, 2005 at 09:34:24AM -0400, Tong wrote:
>
> Are there any tools that can convert European characters to plain
> 7-bit ASCII?
>
> E.g., ä => a, ö => o, etc.
I don't know if there's a better tool, but I would do something like:
$ tr 'äöüß' 'aous' <isolatin1-in >ascii-out
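(simply extend the char lists as required; note that this relies on a
single-byte encoding such as ISO-8859-1, because GNU tr operates on
bytes, not on multibyte characters)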
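If typing the special characters directly is awkward, the same 1-char => 1-char mapping can be written with octal escapes, which tr understands. The following is only a sketch: the function name latin1_to_ascii is made up for illustration, the input is assumed to be ISO-8859-1, and LC_ALL=C is set so that tr treats every byte as one character.

```shell
# Hypothetical helper: map ISO-8859-1 bytes to ASCII, one byte to one byte.
# Octal values of the Latin-1 codes: \344 = ä, \366 = ö, \374 = ü, \337 = ß
latin1_to_ascii() {
    LC_ALL=C tr '\344\366\374\337' 'aous'
}

# Example: the Latin-1 bytes for "Größe" come out as "Grose"
printf 'Gr\366\337e\n' | latin1_to_ascii
```

As with the literal version, extend both lists in parallel; the n-th byte of the first list is replaced by the n-th character of the second.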
This only works for a 1-char => 1-char mapping. If you'd rather have
a 1-char => multiple-char mapping (e.g., in German, we'd typically
substitute ä => ae, ö => oe, etc.), you could start with a little
script like this:
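#!/usr/bin/perl
# Save this script in the same single-byte encoding (e.g. ISO-8859-1)
# as the input, so that each special character is exactly one byte.
use strict;
use warnings;
my %mapping = (
    'ä' => 'ae',
    'ö' => 'oe',
    'ü' => 'ue',
    'ß' => 'ss',
    # ...
);
# build a character class such as \xe4\xf6\xfc\xdf from the keys
my $set = join '', map sprintf("\\x%x", ord $_), keys %mapping;
while (<>) {
    s/([$set])/$mapping{$1}/ge;
    print;
}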
Or, if you'd rather specify the special characters by their hex codes
(in case you have trouble entering them directly), you could write
instead:
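#!/usr/bin/perl
use strict;
use warnings;
# keys are the ISO-8859-1 codes in lowercase hex, to match sprintf "%x"
my %mapping = (
    'e4' => 'ae',   # ä
    'f6' => 'oe',   # ö
    'fc' => 'ue',   # ü
    'df' => 'ss',   # ß
    # ...
);
# build a character class such as \xe4\xf6\xfc\xdf from the keys
my $set = join '', map "\\x$_", keys %mapping;
while (<>) {
    s/([$set])/$mapping{sprintf "%x", ord $1}/ge;
    print;
}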
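If Perl isn't at hand, a 1-char => multiple-char substitution can also be sketched with sed. This is only a rough illustration: the function name umlauts_to_ascii is made up, and because the patterns are plain literals, the script has to be stored in the same encoding as the input it processes.

```shell
# Hypothetical helper: replace each special character with a
# multi-character ASCII sequence, one literal substitution per character.
# The encoding of this script must match the encoding of the input.
umlauts_to_ascii() {
    sed -e 's/ä/ae/g' -e 's/ö/oe/g' -e 's/ü/ue/g' -e 's/ß/ss/g'
}

# Example: "Größe" becomes "Groesse"
printf 'Größe\n' | umlauts_to_ascii
```

Each additional character needs its own -e expression, so for long mappings the table-driven Perl approach is probably easier to maintain.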
Cheers,
Almut
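P.S. Normally, you'd use iconv for encoding conversions. However,
"iconv -f 8859_1 -t ASCII isolatin1-file" doesn't work, because ASCII
can only represent a subset of the characters present in 8859_1, so
iconv complains as soon as it hits one that cannot be converted.
GNU iconv does accept a //TRANSLIT suffix on the target encoding
("iconv -f ISO-8859-1 -t ASCII//TRANSLIT isolatin1-file"), which
substitutes approximations instead of failing, but which replacement
you get for each character depends on the locale.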