Re: [g-i] Arabic / Persian fonts
> General question: starting with an UTF-8 encoded string/file, what is the
> easiest way to get the hex UTF-8 codes of the characters in it?
Could the attached script (written by Denis Barbier, which I use for
writing locales) help?
You need to put the string you want to extract the hex codes from
between quotes:
bubulle@cc-mykerinos:~/tmp> cat test
"This is a test"
bubulle@cc-mykerinos:~/tmp> cat test | utf2uxx
"<U0054><U0068><U0069><U0073><U0020><U0069><U0073><U0020><U0061><U0020><U0074><U0065><U0073><U0074>"
Dunno if this is what you're looking for, though...
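
Note that the <UXXXX> values above are Unicode code points, not UTF-8
bytes. If it's the hex values of the bytes themselves you're after, a rough
shell sketch along these lines might do. The utf8_hex name is made up for
illustration, and the sketch assumes nothing beyond a POSIX sh, printf and od:

```shell
#!/bin/sh
# utf8_hex (hypothetical helper name): print the bytes of its first
# argument as space-separated lowercase hex pairs, one pair per byte.
utf8_hex() {
    # od -A n drops the offset column, -v disables duplicate-line folding,
    # -t x1 prints one hex pair per byte. The command substitution is left
    # unquoted on purpose so word splitting collapses od's padding.
    set -- $(printf '%s' "$1" | od -A n -v -t x1)
    printf '%s\n' "$*"
}

# ASCII text: the bytes match the code points shown above.
utf8_hex 'This'                      # prints: 54 68 69 73

# "é" (U+00E9), given here as its two UTF-8 bytes in octal escapes.
utf8_hex "$(printf '\303\251')"      # prints: c3 a9
```

For plain ASCII the two notations agree, which is why the test string above
doesn't show the difference; anything beyond U+007F takes two or more bytes.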
--
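#!/usr/bin/perl
# utf2uxx: rewrite the contents of double-quoted strings as <UXXXX> code points.
# The original "#! /usr/bin/perl -C1" line is rejected by perl 5.10.1 and later
# ("Too late for -C") unless -C is also given on the command line, so the
# quoted text is decoded explicitly with Encode instead.
use strict;
use warnings;
use Encode qw(decode);

# Convert a UTF-8 byte string to <UXXXX> notation. When $convert_ascii is 0,
# ASCII characters (below 0x80) are copied through unchanged.
sub c {
    my ($bytes, $convert_ascii) = @_;
    my $ret = '';
    for my $char (split //, decode('UTF-8', $bytes)) {
        my $l = ord($char);
        if ($convert_ascii == 0 && $l < 0x80) {
            $ret .= $char;
        } else {
            $ret .= sprintf "<U%04X>", $l;
        }
    }
    return $ret;
}

my $convert_ascii = 1;
while (<>) {
    # Inside LC_IDENTIFICATION, ASCII text stays readable.
    if (/^LC_IDENTIFICATION/) {
        $convert_ascii = 0;
    } elsif (/^END LC_IDENTIFICATION/) {
        $convert_ascii = 1;
    }
    my $conv = $convert_ascii;
    # Quoted arguments of copy and include lines keep their ASCII as well.
    $conv = 0 if (/^(copy|include)/);
    s/"([^"]*)"/'"'.c($1, $conv).'"'/eg;
    print;
}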