Re: Release-critical Bugreport for June 23, 2000
Sami Haahtinen wrote:
> > Pronounceability implies a certain degree of regularity. I could file an
> > RC bug report stating that pwgen always includes vowels in its
> > passwords, but it seems likely it does so by design. I'm not sure that
> > this 'oo' thing isn't also by design.
>
> After a discussion about this with Itai Zukerman, I came to sort of agree
> with this. Although this might be a design issue, in Finnish at least it's
> not that much easier to pronounce 'oo' than any other two-letter sequence.
>
> I think that the person who reported the bug wasn't English and didn't
> see a point here. (I can see the point, but it makes no difference to me.)
>
> Maybe this should be an option... It might be good to downgrade this
> to wishlist and add a comment asking to make it optional.
Unfortunately, I did some more analysis, and it does look pretty bad.
I used a cheesy little perl program to count the number of times each
adjacent pair of letters appears in words, both in
/usr/share/dict/words and in the output of pwgen.
joey@gumdrop:~>cat /usr/share/dict/words| perl -ne '$_=lc $_;
$len=length $_; for ($x=0; $x < $len-2; $x++) { $f{substr($_, $x, 2)}++
}; END { print map { $_="$f{$_}\t$_\n" } keys %f }' |sort -rn | head -20
42411 er
33655 in
31669 ti
29754 on
29403 te
28140 al
28121 an
27247 at
26482 ic
25006 en
24168 is
23906 re
23710 ra
23287 le
23204 ri
22363 ro
22044 st
21704 ne
21336 ar
20849 li
joey@gumdrop:~>pwgen 8 1000000| perl -ne '$_=lc $_; $len=length $_; for
($x=0; $x < $len-2; $x++) { $f{substr($_, $x, 2)}++ }; END { print map {
$_="$f{$_}\t$_\n" } keys %f }' |sort -rn | head -20
490478 oo
180797 th
140042 ho
126362 qu
125847 sh
125724 ng
125699 ch
114552 hi
89148 ha
84968 ee
84269 ae
76404 he
62049 ot
61360 go
55723 it
54609 os
54417 on
50000 is
49978 gi
49900 in
While all the other letter pairs that appeared near the top in frequency
in pwgen output appeared with about the same frequency as in real life in
the wordlist, oo is appearing about 6 times as often as it should.
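
The two runs count pairs over inputs of very different sizes, so the raw
counts are not directly comparable; what matters is each pair's share of
all pairs. Here is a rough sketch of that comparison in Python (not the
perl one-liner used above). The function names pair_frequencies and
overrepresentation are made up for illustration, and this is only one
way to normalize the numbers.

```python
from collections import Counter


def pair_frequencies(words):
    """Return each adjacent letter pair's share of all pairs in `words`."""
    counts = Counter()
    for word in words:
        # Drop the trailing newline and fold case, like `lc` in the one-liner.
        word = word.strip().lower()
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def overrepresentation(sample, reference, pair):
    """Ratio of `pair`'s share in `sample` to its share in `reference`."""
    ref_share = reference.get(pair, 0.0)
    sample_share = sample.get(pair, 0.0)
    if ref_share == 0.0:
        # Pair never occurs in the reference: the ratio is undefined, so
        # report infinity if the sample has it at all, otherwise zero.
        return float("inf") if sample_share else 0.0
    return sample_share / ref_share
```

To use it, pass the lines of /usr/share/dict/words as the reference and
the lines of saved pwgen output as the sample, then ask for
overrepresentation(sample, reference, "oo"). A result well above 1 means
the pair shows up more often in the passwords than in the wordlist.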
--
see shy jo