Bug#174982: [PROPOSAL]: Debian changelogs should be UTF-8 encoded
[ No need to CC me; I am subscribed to -policy ]
On Thu, 2003-01-02 at 00:23, David B Harris wrote:
> Could you provide a quick background about what Unicode is
Sure. Essentially Unicode is a universal character set, used to encode
all the world's languages, plus other symbols from mathematics and the
like. It is intended to supplant regional charsets like US-ASCII,
ISO-8859-1 and BIG5, which were designed for the United States, Western
Europe, and Traditional Chinese (as used in Taiwan and Hong Kong),
respectively. Unicode makes
internationalization and multilingualization much easier.
> and how it
> co-operates with 7-bit ASCII?
The UTF-8 encoding of Unicode (the mapping from code point numbers to
sequences of bytes) is completely backwards compatible with US-ASCII:
every ASCII character is encoded as the same single byte. Moreover, no
ASCII byte ever appears as part of a multibyte sequence, which makes it
filesystem safe, for example, since a byte such as '/' or NUL can only
ever mean itself.
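
As a rough illustration of those two properties, here is a small Python
sketch. The sample strings and the helper name multibyte_bytes are made up
for this example; nothing here comes from the patch or the RFC itself.

```python
# Illustrative sketch only: sample strings and helper name are hypothetical.

def multibyte_bytes(text):
    """Collect every byte that belongs to a multibyte UTF-8 sequence."""
    collected = []
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:  # non-ASCII characters need 2 or more bytes
            collected.extend(encoded)
    return collected

# Property 1: pure ASCII text is byte-for-byte identical in UTF-8.
ascii_text = "Debian changelog"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Property 2: bytes inside multibyte sequences are all >= 0x80, so none of
# them can be mistaken for an ASCII character such as '/'.
sample = "Müller, 日本語, /usr/share/doc"
assert all(b >= 0x80 for b in multibyte_bytes(sample))

# Consequently, every '/' byte in the encoded form is a real '/'.
assert sample.encode("utf-8").count(b"/") == sample.count("/")
```

So a tool that only understands ASCII can still split such a string on '/'
or on newlines without cutting a multibyte character in half.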
See the URL I gave in the patch for more information:
http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2279.html
I hope that helps!