WARNING: potato has horribly broken locales
Hi,
Some time ago I discovered a problem with sort (from textutils). It doesn't
work for me :(. The maintainer of the textutils package wrote to me that the
problem occurs only with my (pl_PL) locale. After that he discovered that even
the en_AU locale is broken.
This bug (#69544) has been reassigned to libc6.
Today I tried (in bash):
ls /dev/tty[a-z]0
and the output contained the unexpected /dev/ttyI0 and /dev/ttyS0, followed by
the /dev/tty[a-z]0 entries.
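For comparison, here is a minimal sketch of how the range is expected to behave in the C locale, where [a-z] covers only the lowercase ASCII letters. The helper name in_range is hypothetical and not part of the original report. It uses bash pattern matching on single characters rather than real /dev entries:

```shell
#!/bin/bash
# Force the C locale so that ranges follow plain byte order.
export LC_ALL=C

# in_range CH: print "yes" if CH matches the glob range [a-z], else "no".
# (Hypothetical helper, for illustration only.)
in_range() {
    case "$1" in
        [a-z]) echo yes ;;
        *)     echo no ;;
    esac
}

in_range s   # lowercase letter, inside the range
in_range S   # uppercase letter, should be outside the range
in_range I   # uppercase letter, should be outside the range
```

Running the same function with LC_ALL set to one of the affected locales instead of C would show whether uppercase letters are wrongly accepted there.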
Then I wrote a simple script (attached at the end) that generates a file with
the characters in the range 0000 to 0377, one per line with its octal number
(like: c101=A). It then searches (using grep) for the [a-c] range under every
locale (locale -a) and counts the matching lines.
Only the locales listed below give count=3 (which may not be correct either):
C ca cs da de el en eo eo_EO es es_AR es_DO es_GT es_HN es_MX es_PA es_PE
es_SV et eu fi fr ga gl gl_ES hr hu id it ja ja_JP.sjis ja_JP.ujis japanese
japanese.euc ko lt nl no no@nynorsk pl POSIX pt ro ru sk sl sr sv tr uk wa X
zh zh_CN zh_TW.Big5
All others give different values.
For 'pl_PL' the count is 15; the proper value is 4 (a a_ogonek b c), though
3 (a b c) could also be acceptable.
'de' gives 3, but 'de_*' and 'deutsch' give 20; as far as I know the proper
value is 4 (a a_umlaut b c).
I think that even all the 'en_*' locales are broken (count=20); the proper
value should be 3.
I'll file a grave bug against libc6, since it breaks all of potato. Or am I being stupid?
Please comment.
Mirek
#!/bin/bash
# Start from a clean environment so that only LANG selects the locale.
unset LANG LC_ALL LC_CHARSET LC_COLLATE LC_CTYPE \
      LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
# Write one line per character code 0000..0377 (octal), e.g. "c101=A".
for c in $(seq 0 3); do
    for b in $(seq 0 7); do
        for a in $(seq 0 7); do
            echo -e "c$c$b$a=\\$c$b$a"
        done
    done
done > /tmp/char.list
# For every installed locale, count the lines whose character matches [a-c].
for l in $(locale -a); do
    export LANG=$l
    n=$(grep -a '=[a-c]' /tmp/char.list | wc -l)
    echo $n $l
done > /tmp/lang.list