Re: [g-i] Arabic / Persian fonts

To: debian-boot@lists.debian.org
Subject: Re: [g-i] Arabic / Persian fonts
From: Christian Perrier <bubulle@debian.org>
Date: Wed, 8 Feb 2006 22:16:12 +0100
Message-id: <20060208211612.GI5287@djedefre.onera>
In-reply-to: <200602082037.59440.aragorn@tiscali.nl>
References: <20060130225255.GD3862@slurp> <200602011959.03538.aragorn@tiscali.nl> <60381eeb0602012252v3fb454a2xce38898fe7ad69db@mail.gmail.com> <200602082037.59440.aragorn@tiscali.nl>

> General question: starting with an UTF-8 encoded string/file, what is the 
> easiest way to get the hex UTF-8 codes of the characters in it?

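Could the attached script help? It was written by Denis Barbier, and I
use it when writing locales. It prints the Unicode code point of each
character in the <Uxxxx> notation used in locale files.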

The script only converts text enclosed in double quotes, so you need to
put the string you want to extract the hex codes from between quotes:

bubulle@cc-mykerinos:~/tmp> cat test
"This is a test"
bubulle@cc-mykerinos:~/tmp> cat test | utf2uxx
"<U0054><U0068><U0069><U0073><U0020><U0069><U0073><U0020><U0061><U0020><U0074><U0065><U0073><U0074>"


Dunno if this is what you're looking for, though...
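
If the UTF-8 bytes themselves are wanted rather than the code points,
something along these lines might do. This is only a minimal sketch:
utf8_hex_table is a hypothetical helper, not part of the attached
script, and it assumes the input is a valid UTF-8 encoded byte string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of utf2uxx): for each character of a
# UTF-8 encoded byte string, return "U+XXXX" followed by the hex values
# of the bytes that encode that character in UTF-8.
sub utf8_hex_table {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # bytes -> characters
    my @rows;
    for my $char (split //, $text) {
        my $codepoint = sprintf 'U+%04X', ord $char;
        my $utf8 = join ' ',
            map { sprintf '%02X', $_ } unpack 'C*', encode('UTF-8', $char);
        push @rows, "$codepoint $utf8";
    }
    return @rows;
}

# "\xD8\xB3" is the UTF-8 encoding of ARABIC LETTER SEEN (U+0633).
print "$_\n" for utf8_hex_table("a\xD8\xB3");
```

For the sample string this prints "U+0061 61" and "U+0633 D8 B3", one
per line: the code point first, then its UTF-8 bytes.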


--

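#! /usr/bin/perl -C1
# -C1 makes Perl decode STDIN as UTF-8, so feed the input on STDIN.
# Run the script directly; with "perl utf2uxx", -C1 must also be given
# on the command line.

use strict;
use warnings;

# Return $text with each character replaced by its <Uxxxx> code point.
# When $convert_ascii is 0, ASCII characters (below 0x80) are kept as-is.
sub c {
	my $text = shift;
	my $convert_ascii = shift;
	my $ret = '';
	# Consume the text one character at a time.
	while ($text =~ s/(.)//) {
		my $l = unpack("U", $1);
		if ($convert_ascii == 0 && $l < 0x80) {
			$ret .= $1;
		} else {
			$ret .= sprintf "<U%04X>", $l;
		}
	}
	return $ret;
}

# ASCII is converted everywhere except inside the LC_IDENTIFICATION
# section and on "copy" / "include" lines.
my $convert_ascii = 1;
while (<>) {
	if (/^LC_IDENTIFICATION/) {
		$convert_ascii = 0;
	} elsif (/^END LC_IDENTIFICATION/) {
		$convert_ascii = 1;
	}
	my $conv = $convert_ascii;
	$conv = 0 if (/^(copy|include)/);
	# Only text between double quotes is converted.
	s/"([^"]*)"/'"'.c($1, $conv).'"'/eg;
	print;
}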
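
To see which quoted strings keep their ASCII characters, the rules of
the script can be restated as a function and run on a small made-up
locale fragment. This is only an illustrative sketch: escape_text and
convert_lines are hypothetical names, the sample lines are invented, and
the lines are assumed to be already decoded into characters.

```perl
use strict;
use warnings;

# Hypothetical restatement of c(): replace characters by <Uxxxx>,
# keeping ASCII untouched when $convert_ascii is false.
sub escape_text {
    my ($text, $convert_ascii) = @_;
    my $out = '';
    for my $char (split //, $text) {
        my $code = ord $char;
        if (!$convert_ascii && $code < 0x80) {
            $out .= $char;
        } else {
            $out .= sprintf '<U%04X>', $code;
        }
    }
    return $out;
}

# Hypothetical restatement of the main loop, working on a list of lines
# instead of STDIN and returning the converted lines.
sub convert_lines {
    my @lines = @_;
    my $convert_ascii = 1;
    for my $line (@lines) {
        if ($line =~ /^LC_IDENTIFICATION/) {
            $convert_ascii = 0;
        } elsif ($line =~ /^END LC_IDENTIFICATION/) {
            $convert_ascii = 1;
        }
        # "copy" and "include" lines always keep their ASCII characters.
        my $conv = ($line =~ /^(?:copy|include)/) ? 0 : $convert_ascii;
        $line =~ s/"([^"]*)"/'"' . escape_text($1, $conv) . '"'/eg;
    }
    return @lines;
}

print "$_\n" for convert_lines(
    'LC_IDENTIFICATION',
    'title "Test"',
    'END LC_IDENTIFICATION',
    'copy "en_US"',
    "day \"\x{E9}t\x{E9}\"",
);
```

In this sample, "Test" and "en_US" come out unchanged, while the last
string becomes "<U00E9><U0074><U00E9>".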

References:
- Re: [g-i] Arabic / Persian fonts
  - From: Frans Pop <aragorn@tiscali.nl>
- Re: [g-i] Arabic / Persian fonts
  - From: Eddy Petrişor <eddy.petrisor@gmail.com>
- Re: [g-i] Arabic / Persian fonts
  - From: Frans Pop <aragorn@tiscali.nl>
