
WARNING: potato has horrible broken locales



Hi,

Some time ago I discovered a problem with sort (from textutils): it doesn't
work for me :(. The maintainer of the textutils package wrote to me that the
problem occurs only with my (pl_PL) locale. After that he discovered that even
the en_AU locale is broken.

This bug (#69544) has been reassigned to libc6.

Today I tried (in bash):

  ls /dev/tty[a-z]0

and the output unexpectedly includes /dev/ttyI0 and /dev/ttyS0, followed by
the /dev/tty[a-z]0 entries.
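
One way to check whether the locale is responsible is to repeat the glob
with LC_ALL=C and compare. The sketch below is only an illustration: it uses
a scratch directory (a hypothetical stand-in for /dev) holding four file
names modelled on the entries above, and it makes no claim about what any
particular non-C locale returns.

```shell
#!/bin/bash
# Scratch directory standing in for /dev; the file names are illustrative only.
dir=$(mktemp -d)
touch "$dir/ttya0" "$dir/ttyp0" "$dir/ttyI0" "$dir/ttyS0"

# In the C locale, [a-z] covers only the lowercase ASCII letters.
c_matches=$(cd "$dir" && LC_ALL=C bash -c 'echo tty[a-z]0')

# Same glob under whatever locale the calling shell is using.
cur_matches=$(cd "$dir" && bash -c 'echo tty[a-z]0')

echo "C locale:       $c_matches"
echo "current locale: $cur_matches"

rm -rf "$dir"
```

If the two lines differ, the extra names come from the locale's collation
order rather than from bash itself.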

Then I wrote a simple script (attached at the end) that generates a file with
the characters in the range 0000 to 0377 (one per line with its octal number,
like: c101=A). It then searches (using grep) for the [a-c] range under every
locale (locale -a) and counts the matching lines.

Only the locales listed below give count=3 (which may not be correct either):

C ca cs da de el en eo eo_EO es es_AR es_DO es_GT es_HN es_MX es_PA es_PE
es_SV et eu fi fr ga gl gl_ES hr hu id it ja ja_JP.sjis ja_JP.ujis japanese
japanese.euc ko lt nl no no@nynorsk pl POSIX pt ro ru sk sl sr sv tr uk wa X
zh zh_CN zh_TW.Big5

All others give different values.

For 'pl_PL' the count is 15; the proper value is 4 (a a_ogonek b c), though 3
(a b c) could also be acceptable.

'de' gives 3, but 'de_*' and 'deutsch' give 20; as far as I know the proper
value is 4 (a a_umlaut b c).

I think that even all 'en_*' locales are broken (count=20); the proper value
should be 3.
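
A smaller cross-check of that baseline, as a sketch only: it feeds grep a
handful of ASCII letters, counts the lines matching [a-c] under C, and then
under one other locale. The locale name is taken from the first argument;
the en_US default is just a placeholder and may not be installed, in which
case grep typically falls back to C behaviour.

```shell
#!/bin/bash
# Locale to compare against C; en_US is only a placeholder default.
loc=${1:-en_US}

# Sample input: one ASCII character per line.
sample=$(printf '%s\n' a b c d A B C D)

# Baseline: in the C locale [a-c] matches exactly a, b and c.
c_count=$(printf '%s\n' "$sample" | env LC_ALL=C grep -c '^[a-c]$')

# Same pattern under the locale being checked; no particular result is assumed.
loc_count=$(printf '%s\n' "$sample" | env LC_ALL="$loc" grep -c '^[a-c]$')

echo "C: $c_count  $loc: $loc_count"
```

With ASCII-only input, any count above 3 in the second column means the
range is picking up uppercase letters.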

I'll file a grave bug against libc6, since it breaks all of potato. Or am I
being stupid?

Please comment.

Mirek


#!/bin/bash

unset LANG
unset LC_ALL
unset LC_CHARSET
unset LC_COLLATE
unset LC_CTYPE
unset LC_MESSAGES
unset LC_MONETARY
unset LC_NUMERIC
unset LC_TIME

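# Generate one line per byte value 000..377 (octal), e.g. c101=A.
for c in $(seq 0 3); do
for b in $(seq 0 7); do
for a in $(seq 0 7); do

  echo -e "c$c$b$a=\\$c$b$a"

done; done; done >/tmp/char.list


# Count the lines matching '=[a-c]' under each installed locale.
for l in $(locale -a); do

  export LANG=$l
  c=$(grep -a -c '=[a-c]' /tmp/char.list)
  echo "$c $l"

done > /tmp/lang.list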
