
Bug#782225: mksh: parameter expansion: string length ${#parameter} is incorrect on multibyte character



On 2015-04-09 12:04:43 +0000, Thorsten Glaser wrote:
> Vincent Lefevre dixit:
> 
> >In UTF-8 based locales:
> 
> >$ mksh -c 'a=$(/usr/bin/printf \\u00e9); echo $a ${#a}'
> 
> tglase@tglase:~ $ mksh -c 'a=$(/usr/bin/printf \\u00e9); echo $a ${#a}'
> é 2
> tglase@tglase:~ $ mksh -Uc 'a=$(/usr/bin/printf \\u00e9); echo $a ${#a}'
> é 1
> 
> This works as specified for mksh: UTF-8 mode is disabled by
> default in scripts or for -c to not break any existing scripts.

I think that mksh should enable this option (UTF-8 mode) in posix and sh modes.
The following is even more unexpected:

$ lksh -o sh -c 'a=$(/usr/bin/printf \\u00e9); echo $a ${#a}'
é 2
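
For reference, here is a minimal sketch of where the two numbers come from. It assumes that é (U+00E9) is encoded in UTF-8 as the two bytes 0xC3 0xA9 and that printf accepts octal escapes in its format string. The variable names are only illustrative, and the mksh part is skipped when mksh is not installed.

```shell
#!/bin/sh
# é (U+00E9) in UTF-8 is the two-byte sequence 0xC3 0xA9 (octal 303 251).
s=$(printf '\303\251')

# wc -c counts bytes, independently of the locale.
bytes=$(printf %s "$s" | wc -c | tr -d ' ')
echo "bytes: $bytes"

# Compare the two invocations from the messages above, if mksh is present.
# Without -U the length is reported in bytes, with -U in characters.
if command -v mksh >/dev/null 2>&1; then
    mksh -c 'a=$(printf "\303\251"); echo "without -U: ${#a}"'
    mksh -Uc 'a=$(printf "\303\251"); echo "with -U: ${#a}"'
fi
```

The value 2 is thus the byte length of the string, which matches the character length only for single-byte encodings.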

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

