
Bug#1123509: xterm: various issues with the U+FE0F VARIATION SELECTOR-16 (VS16) character after emoji



On Wed, Dec 17, 2025 at 04:50:45AM +0100, Vincent Lefevre wrote:
> Package: xterm
> Version: 405-1
> Severity: important
> 
> With xterm 405, the use of the U+FE0F VARIATION SELECTOR-16 (VS16)
> character after an emoji can completely corrupt the display with
> Mutt. GNU Screen also gets broken with the command below (issues
> with the last line of the terminal). I suspect that this is due
> to an inconsistency between the xterm behavior and wcwidth(),
> which may affect various applications that rely on wcwidth().

Without the Emoji width feature (which, as I mentioned, should be
configurable), xterm's wcwidth is a close match for glibc's wcwidth.
The few differences I noticed in testing appear to be problems in
glibc.

Checking now, mutt has its own wcwidth.c, which is not often used (it is
a compile-time option). That is just as well, because its tables are very
old. It also has a wrapper for wcwidth which makes some assumptions about
iswprint that make its behavior problematic except with glibc.

> I have not checked wcswidth().

Nor have I. Actually, I don't believe it is often used.
mutt imitates it by repeatedly calling wcwidth, and doesn't account for VS16.

Because mutt isn't accounting for VS16, that's an issue for which xterm
"should" be configurable, so we can accommodate programs which pass through
VS15 and VS16 without accounting for their behavior.
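
To make the mismatch concrete, here is a minimal sketch, not code from mutt
or xterm. It assumes that VS16 promotes a one-column base character to two
columns, which is my reading of the Emoji width feature. The names width_fn,
toy_width, naive_columns and vs16_columns are hypothetical, and toy_width is
a stand-in table rather than glibc's or xterm's data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VS16 0xFE0Fu

/* Hypothetical stand-in for wcwidth(): code point in, columns out. */
typedef int (*width_fn)(uint32_t cp);

/* Toy table for illustration only (NOT glibc's or xterm's data):
 * variation selectors are zero-width, everything else is one column. */
int toy_width(uint32_t cp)
{
    return (cp >= 0xFE00u && cp <= 0xFE0Fu) ? 0 : 1;
}

/* Sum per-character widths, the way a wcswidth() imitation would. */
int naive_columns(const uint32_t *s, size_t n, width_fn width)
{
    int total = 0;

    assert(width != NULL);
    for (size_t i = 0; i < n; i++) {
        int w = width(s[i]);
        if (w < 0)
            return -1;          /* nonprintable, as wcswidth() reports */
        total += w;
    }
    return total;
}

/* Same sum, but ASSUMING that a following VS16 promotes a one-column
 * base character to two columns. */
int vs16_columns(const uint32_t *s, size_t n, width_fn width)
{
    int total = 0;

    assert(width != NULL);
    for (size_t i = 0; i < n; i++) {
        int w = width(s[i]);
        if (w < 0)
            return -1;
        if (w == 1 && i + 1 < n && s[i + 1] == VS16)
            w = 2;              /* assumption: emoji presentation is wide */
        total += w;
    }
    return total;
}
```

With this toy table, 60 repetitions of U+2642 U+FE0F come to 60 columns in
the naive sum and 120 columns in the VS16-aware sum, so an application using
the first and a terminal using the second would disagree about where an
80-column line wraps.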

(I haven't investigated "neomutt", which may provide improvements, though
the "neo" cult appears to rely heavily upon hard-coding).
 
> But there are issues even with simple output. In a 80-column terminal:
> 
>   perl -C -e 'print "\x{2642}\x{FE0F}"x60, "\n"'

perl is yet another pitfall.  In developing #404, I looked into the wcwidth
data used in NetBSD/OpenBSD, which reportedly is tied to perl.  That ignores
the East Asian width data entirely, and doesn't match glibc very well.

For your example, though, perl is irrelevant: this is just bits...
 
> I get "♂♂" in the last two columns, which is inconsistent with what
> is output before. And in case of scrolling, the spaces are missing
> in the second line.

xterm handles fullwidth characters by putting a non-character in the
second cell.  In handling VS16, I may have overlooked some path for doing
that (something to investigate).  But the behavior in mutt was consistent
with my expectation: an extra "blank" cell.
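
As a rough illustration of the two-cell layout, here is a sketch (not xterm's
actual code). FILLER and put_cell are hypothetical names, and the choice of
U+FFFF as the marker is an assumption made for the example. Refusing a wide
character that does not fit in the last column is only one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for the second half of a wide character.
 * U+FFFF is a Unicode noncharacter; the value is an assumption here. */
#define FILLER 0xFFFFu

/* Store cp at column col of a row with ncols cells.  A two-column
 * character also claims the next cell with FILLER.  Returns the next
 * free column, or col unchanged if the character does not fit. */
size_t put_cell(uint32_t *row, size_t ncols, size_t col,
                uint32_t cp, int width)
{
    assert(row != NULL);
    if (width < 1 || width > 2 || col + (size_t) width > ncols)
        return col;             /* caller must wrap or drop it */
    row[col] = cp;
    if (width == 2)
        row[col + 1] = FILLER;  /* second cell holds no real character */
    return col + (size_t) width;
}
```

If some code path stores the base character but skips the FILLER step, the
row and the width bookkeeping no longer agree, which is the kind of
oversight worth checking for in the VS16 handling.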
 
> And selection/deselection of such output gives random behavior.
> In particular:
>   * Spurious ♂ characters appear where there were spaces.
>   * Some ♂ characters still appear in reverse video after everything
>     has been unselected.
> 
> -- System Information:
> Debian Release: forky/sid
>   APT prefers unstable-debug
>   APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable-debug'), (500, 'proposed-updates-debug'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
> Architecture: amd64 (x86_64)
> 
> Kernel: Linux 6.7.12-amd64 (SMP w/16 CPU threads; PREEMPT)
> Kernel taint flags: TAINT_WARN
> Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
> Shell: /bin/sh linked to /usr/bin/dash
> Init: systemd (via /run/systemd/system)
> LSM: AppArmor: enabled
> 
> Versions of packages xterm depends on:
> ii  libc6           2.42-6
> ii  libfontconfig1  2.15.0-2.4
> ii  libfreetype6    2.13.3+dfsg-1
> ii  libice6         2:1.1.1-1
> ii  libtinfo6       6.5+20251123-1
> ii  libutempter0    1.2.1-4
> ii  libx11-6        2:1.8.12-1
> ii  libxaw7         2:1.0.16-1
> ii  libxext6        2:1.3.4-1+b3
> ii  libxft2         2.3.6-1+b4
> ii  libxinerama1    2:1.1.4-3+b4
> ii  libxmu6         2:1.1.3-3+b4
> ii  libxpm4         1:3.5.17-1+b3
> ii  libxt6t64       1:1.2.1-1.3
> ii  xbitmaps        1.1.1-2.2
> 
> Versions of packages xterm recommends:
> ii  luit [luit]  2.0.20250912-1
> ii  x11-utils    7.7+7
> 
> Versions of packages xterm suggests:
> pn  xfonts-cyrillic  <none>
> 
> -- no debconf information
> 
> -- 
> Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
> Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
> 

-- 
Thomas E. Dickey <dickey@invisible-island.net>
https://invisible-island.net
