
Bug#872778: xterm -lc (with UTF-8 locale) cannot properly copy some utf-8 unicode chars



On Mon, Sep 17, 2018 at 04:47:28AM -0400, Thomas Dickey wrote:
> On Wed, Aug 22, 2018 at 08:05:51PM +1000, Zenaan Harkness wrote:
> > Create a text file containing e.g. the musical natural symbol, and
> > the mathematical function symbol, e.g. "ƒƒƒ ♮♮♮" (three function
> > symbols, a space, and three natural symbols, inside plain quotes).
> > 
> > Now in an xterm -lc instance, with a UTF-8 locale, cat the file.
> > 
> > xterm displays the function and the natural symbols.
> > 
> > Now start the utf-8 compatible gui editor Geany, and open the same
> > file in Geany.
> > 
> > Copy and paste those characters from Geany, into Geany - works.
> > 
> > Copy from Geany, paste to xterm - this also works.
> > 
> > Select/copy from xterm, middle-click paste into Geany - only the
> > natural symbols, and not the function symbols, are pasted, also
> > pasting to xterm (from copying from xterm) does not work.
> > 
> > SO, xterm is not properly copying some UTF-8 Unicode characters.
> 
> This update is unrelated to the original report, which deals with
> characters past BMP (the example uses U+0192 and U+266E).
> 
> I have not been able to reproduce the problem.
>  
>  See also:
> > https://lists.debian.org/debian-user/2017/09/msg00518.html
> > https://lists.debian.org/debian-user/2017/09/msg00527.html
> > 
> > Should I file a different bug for this, or just leave this here?
> 
> It might be related to #901249, but I cannot say.  The other client
> (Geany) seems to be a factor - if you can reproduce the problem with
> xsel, that would be helpful.  copy and paste rely on the source to
> provide the data in different formats, and the target to request
> what's appropriate.

OK, so I've tested just using xsel:

The string I start with is "# ƒƒ ♮♮" without the quotes, and that
should appear as:

hash space function function space natural natural

In vim in xfce4-terminal (to write this email), that sequence pastes
correctly.

Now, in xfce4-terminal, after selecting those chars, xsel -o
correctly dumps them.

Jumping immediately to xterm -lc, then:

  xsel -o -also- correctly dumps those chars to the xterm.

That's good.

Next, select those chars in xterm: xsel -o no longer dumps the
function symbols.

That's not good.

xfce4-terminal now has the same problem: xsel -o there does NOT dump
the function symbols, and neither does a middle-click paste into
Geany.

SO, in my setup at least, the problem is copying the function symbol
-from- xterm. Copying from other apps (Geany, vim in xfce4-terminal,
and xfce4-terminal itself) all works correctly with xsel -o, in both
xfce4-terminal and xterm -lc.

According to https://en.wikipedia.org/wiki/%C6%91 this "function
symbol" is actually called the "florin sign", but in any case it has
the code point U+0192, which is well within the 16-bit range of the
Basic Multilingual Plane (as is U+266E), so neither character is past
the BMP.
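
As a cross-check on the code points, the UTF-8 bytes can be dumped
directly. This is only a sketch: the variable names are made up for
illustration, and octal escapes are used so the result does not depend
on the locale or on the shell's printf supporting \u.

```shell
# UTF-8 encoding of U+0192 (florin sign) and U+266E (music natural sign),
# written as octal escapes and dumped as hex bytes.
florin=$(printf '\306\222' | od -An -tx1 | xargs)
natural=$(printf '\342\231\256' | od -An -tx1 | xargs)
echo "U+0192 -> $florin"
echo "U+266E -> $natural"

# Both code points are below 0x10000, i.e. inside the BMP.
[ $((0x0192)) -lt $((0x10000)) ] && echo "U+0192 is in the BMP"
[ $((0x266E)) -lt $((0x10000)) ] && echo "U+266E is in the BMP"
```

U+0192 encodes to two bytes and U+266E to three, so both are ordinary
BMP characters.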


Here's what a little test run looks like in xterm -l (I've bound the
function symbol to my keyboard so I can type it successfully):

$ echo ƒƒƒƒ
ƒƒƒƒ
$ # select above string, and:
$ xsel -o
$ 
$ # now middle click:
$ ?????^C
$ # now select from xfce4-terminal, then come back here:
$ xsel -o
ƒƒƒƒ$ 
$ # now middle click:
$ ƒƒƒƒ
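
To see exactly which bytes go missing, the output of xsel -o could be
hex-dumped and compared with the expected UTF-8 encoding of the test
string. A minimal sketch, not part of the test run above: the hexbytes
helper is a hypothetical name, and the xclip call (which asks the
selection owner which TARGETS it offers) assumes xclip is installed.

```shell
# Hypothetical helper: print stdin as space-separated hex bytes.
hexbytes() { od -An -tx1 | xargs; }

# Expected UTF-8 bytes for "# ƒƒ ♮♮" (octal escapes avoid locale issues).
expected=$(printf '# \306\222\306\222 \342\231\256\342\231\256' | hexbytes)
echo "expected: $expected"

# Only meaningful inside an X session, after selecting the string in xterm.
if [ -n "${DISPLAY:-}" ] && command -v xsel >/dev/null 2>&1; then
    actual=$(xsel -o | hexbytes)
    echo "actual:   $actual"
    if [ "$actual" = "$expected" ]; then
        echo "selection intact"
    else
        echo "selection differs"
    fi
    # Assumption: xclip is installed; TARGETS lists the formats on offer.
    if command -v xclip >/dev/null 2>&1; then
        xclip -o -selection primary -t TARGETS || true
    fi
fi
```

If the "actual" line lacks the c6 92 pairs, or shows 3f ("?") in their
place, that would show what xterm handed over for the selection.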



Thomas, is there any other test I can run on Debian stable?

