
Re: why would "tr --complement --squeeze-repeats ..." append the substitution char once more? ...



On Mon, Dec 11, 2023 at 02:11:46PM +0100, tomas@tuxteam.de wrote:
> On Mon, Dec 11, 2023 at 07:42:10AM -0500, Greg Wooledge wrote:
> > Looks like GNU tr in Debian 12 still doesn't handle multibyte characters
> > correctly:
> > 
> >     unicorn:~$ echo 'mañana' | tr ñ X
> >     maXXana
> 
> Hey, you just gave us a handy way to count how many encoding units
> a character takes:
> 
>   tomas@trotzki:~$ echo 'birdie🐦here' | tr -c 'a-z' X
>   birdieXXXXhereX

Cute as that is, there are better ways.  In a UTF-8 locale, "${#x}" counts
characters; with LC_ALL=C it counts bytes.

    unicorn:~$ x=ñ; (echo "${#x}"; LC_ALL=C; echo "${#x}")
    1
    2
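
If you want this more than once, the same trick can be wrapped in a small
function.  This is only a sketch: byte_len is a name made up here (it is not
from the thread or from any standard tool), and the byte counts assume the
string is encoded as UTF-8.  The body runs in a subshell, so the LC_ALL
assignment does not leak into the calling shell.

```shell
# byte_len: hypothetical helper, prints the number of bytes in its argument.
# The parentheses make the function body a subshell, so setting LC_ALL=C
# here leaves the caller's locale untouched.
byte_len() (
    LC_ALL=C              # in the C locale, ${#var} counts bytes
    s=$1
    printf '%s\n' "${#s}"
)
```

With that defined, byte_len 'ñ' should print 2 and byte_len '🐦' should
print 4 for UTF-8 input, which lines up with the four X's tr produced for
the bird (the fifth X at the end is the newline from echo).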

