Christian Perrier wrote:
On Saturday, 23 August 2008 at 19:59 -0500, Adam Majer wrote:

Package: gedit
Version: 2.22.3-1
Severity: normal

The following UTF-8 string is not correctly handled in gedit,

const char *unicode_insert = "?Э";

The " and the ? characters are viewed as one character, making the entire thing next to impossible to copy/paste/edit.

Looks like an issue in pango, since it is not specific to gedit. Such things seem to happen a lot when using Tibetan characters, so this may or may not be intentional. I'd prefer to have the input of someone who uses them. Is there anyone on debian-i18n who's more knowledgeable about Tibetan glyphs?

Adding Pema Geyleg and Tenzin Dendup, our fellow Dzongkha translation coordinators, who certainly have expertise in Tibetan-family scripts (Dzongkha is one of these) and could maybe point you to people with the needed knowledge.
I'm sorry, but aren't we missing the entire point here? This is not about bad handling of some Tibetan characters. It is about bad handling of 3-byte UTF-8 characters.

http://en.wikipedia.org/wiki/UTF-8

So, the following characters should have the same problems,

"ऄक
"ঈউঊ
"ਜਗਏ
"ଜଁଂ
"ஔ
"ంఁః
"ಂಖ
"ഈഃ

etc. I've put an ASCII " in front of all the different characters. In emacs, I'm able to select the " in front of these characters and copy it. vim under a UTF-8 gnome terminal also allows the " to be selected. In the second-to-last line of the list above (using icedove), I can't independently select the " but I can select the " and ಂ together and then remove the second character.
Maybe it is just my misunderstanding of UTF-8, I'm not sure. But at least my expected behaviour was being able to select 1 UTF-8 character at a time, even if linguistically it does not make any sense.
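One way to check that assumption is to look at each code point in the test strings: how many bytes it takes in UTF-8, and what Unicode general category it has. The following is only a minimal Python sketch, not something from the original report; `describe` is a hypothetical helper name, and the two sample strings are taken from the list above. A category starting with "M" marks a combining mark, which a text-shaping library may reasonably attach to the preceding character.

```python
import unicodedata


def describe(text):
    """Return (code point, UTF-8 byte count, general category) per character."""
    return [
        (f"U+{ord(ch):04X}", len(ch.encode("utf-8")), unicodedata.category(ch))
        for ch in text
    ]


if __name__ == "__main__":
    # ASCII quote followed by Kannada and Oriya characters from the list above.
    samples = ['"\u0c82\u0c96', '"\u0b1c\u0b01\u0b02']
    for sample in samples:
        for codepoint, nbytes, category in describe(sample):
            print(codepoint, "bytes:", nbytes, "category:", category)
```

If the character right after the " turns out to be a mark, the byte length alone would not explain the selection behaviour, since the letters in the same strings are also three bytes long.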
- Adam