Re: Precomposed Unicode layouts and permutations (was:Re: hello + UTF-8
On 8/7/05, Clytie Siddall <firstname.lastname@example.org> wrote:
> On 07/08/2005, at 5:44 PM, Steve Langasek wrote:
> > FWIW, I'm pretty sure there is no such thing as a precomposed layout
> > for devanagari script; the combinatorics (pairing each possible vowel
> > sign with each possible consonant character, plus arbitrary numbers
> > of combining forms for consonant clusters) don't lend themselves to
> > assigning a separate Unicode codepoint for each combination, and
> > indeed, I don't see any sign of these combos in Unicode.
> How many combinations are we talking about? With Vietnamese, the
> tones mean we have seventy-two vowels, which works for precomposed
> layouts for us. Without precomposed, until Level 2 Unicode is
> properly supported, we combined-diacritics languages have severe
> problems with consistent input and display across a range of software.
Valid point, but it's not just vowels in question here.
A short introduction:
Since Sanskrit (like most Indian languages) is highly phonetic in
nature, the Devanagari alphabet consists of a variety of short, long
and protracted vowels besides the 33 consonants. Some of the
consonants are semivowels (y, r, l, v), sibilants (s, sh) or the
sonant aspirate (h). Besides these, there are two nasal sounds, each
written as a mark above the letter after which it is to be
pronounced: one is a dot, and the other is a dot within a semicircle.
As Steve mentioned earlier, the compounding consonants can take many
forms:
# half-consonant + consonant
# consonant + vowel
# half-consonant + consonant + vowel
In some conjuncts the component elements are scarcely discernible,
while some are written in two ways, and so on: 270 conjunct
consonants at the last count (maybe more, but not fewer). Sandhi has
a complex grammatical structure and it's a constant struggle (for me
at least).
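
To make the combinations above concrete, here is a minimal sketch,
assuming Python 3 and its standard unicodedata module. It shows how
a conjunct is stored as a sequence of codepoints joined by the virama
sign rather than as one codepoint of its own. The helper name
describe() and the choice of the conjunct "kSa" are mine, picked only
for illustration.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA: suppresses the inherent vowel

def describe(text):
    """Return (codepoint, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# half-consonant + consonant: KA + VIRAMA + SSA is displayed as one conjunct
conjunct = "\u0915" + VIRAMA + "\u0937"

# half-consonant + consonant + vowel: the same conjunct plus VOWEL SIGN I
conjunct_with_vowel = conjunct + "\u093F"

for label, text in [("conjunct", conjunct),
                    ("conjunct + vowel", conjunct_with_vowel)]:
    print(label, text)
    for codepoint, name in describe(text):
        print("   ", codepoint, name)
```

What looks like a single shape on screen is three or four separate
codepoints in the text, so each new conjunct costs nothing in the
encoding; it is left to the font and the rendering engine to draw it.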
So I guess this (^^) is a major reason why a precomposed layout is
difficult, if not impractical, to implement.
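
The contrast with the Vietnamese case can be sketched in a few lines,
again assuming Python 3 and the standard unicodedata module (the
sample strings are my own choice, not taken from any layout). NFC
normalization folds a Vietnamese vowel and its combining marks into
one precomposed codepoint, while a Devanagari conjunct with a vowel
sign has no precomposed form to fold into and stays a sequence.

```python
import unicodedata

# Vietnamese: e + COMBINING CIRCUMFLEX ACCENT + COMBINING ACUTE ACCENT
viet_decomposed = "e\u0302\u0301"
viet_composed = unicodedata.normalize("NFC", viet_decomposed)

# Devanagari: KA + VIRAMA + SSA + VOWEL SIGN I
deva_sequence = "\u0915\u094D\u0937\u093F"
deva_normalized = unicodedata.normalize("NFC", deva_sequence)

# Number of codepoints before and after normalization
print("Vietnamese:", len(viet_decomposed), "->", len(viet_composed))
print("Devanagari:", len(deva_sequence), "->", len(deva_normalized))
```

The Vietnamese string shrinks to a single codepoint (U+1EBF), whereas
the Devanagari string comes back unchanged.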
Hope that helps somewhat and sorry for the long post :)
|| स्वक्ष || svaksha