This is another example, taken from the Sinhala fonts: "ෝ is really 2 characters, " followed by U+0DDD. - Adam
The following UTF-8 string is not handled correctly in gedit: const char *unicode_insert = "?Э"; The " and the ? characters are displayed as a single character, which makes the whole line next to impossible to copy, paste, or edit.
The full bug report can be read at http://bugs.debian.org/496266
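
For anyone trying to reproduce this, here is a minimal C sketch showing that the first example (a double quote followed by U+0DDD) really is two code points stored in four UTF-8 bytes, even though it gets drawn as one unit. The names quote_plus_vowel_sign, utf8_code_points and utf8_bytes are made up for illustration; they are not from gedit or GTK. The counter assumes well-formed UTF-8 and does no validation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample: '"' (U+0022) followed by U+0DDD, written out as
 * UTF-8 bytes. U+0DDD encodes as the three bytes E0 B7 9D. */
static const char quote_plus_vowel_sign[] = "\"\xE0\xB7\x9D";

/* Count code points in a NUL-terminated UTF-8 string by counting every
 * byte that is not a continuation byte (10xxxxxx).
 * Assumption: the input is well-formed UTF-8; nothing is validated. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

/* Number of bytes in the string, not counting the terminating NUL. */
size_t utf8_bytes(const char *s)
{
    return strlen(s);
}
```

Calling utf8_code_points(quote_plus_vowel_sign) gives 2 and utf8_bytes(quote_plus_vowel_sign) gives 4, so the data itself still holds the quote and the vowel sign as separate code points. The merging happens when the text is displayed and edited, since the vowel sign is a combining mark that attaches to whatever precedes it.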