Re: Dictionary changes

To: debian-user@lists.debian.org
Subject: Re: Dictionary changes
From: Bzzzz <lazyvirus@gmx.com>
Date: Wed, 2 Jul 2014 18:40:18 +0200
Message-id: <20140702184018.6aabc27b@anubis.defcon1>
In-reply-to: <20140702122202.3ae6dd35@mydesq2.domain.cxm>
References: <20140702122202.3ae6dd35@mydesq2.domain.cxm>

On Wed, 2 Jul 2014 12:22:02 -0400
Steve Litt <slitt@troubleshooters.com> wrote:

> Another thing to remember is that the wordlist is no longer ASCII,

An excellent thing in the age of UTF-N.

> cat /usr/share/dict/words | grep -i "$1"

Simplify it: grep -i "$1" /usr/share/dict/words

> If you look up ^smor.*rd$, you get nothing. But if you look up
> ^sm.*rd$ you get smörgåsbord. What I'd like to do is get grep to
> think "å" is a hit for "a" and report it, but report it as "å".
> I'll let you know when I figure out how to do that, or do some
> other thing that produces the same result. Prepending LC_ALL=
> either C, C.UTF-8, en_US.utf8, or POSIX, to the grep command,
> didn't do it either.

You can't, because these letters do not have the same code
in either encoding.
(But your case is interesting; maybe a rewritten grep that
does such conversions would be of interest.)
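
To illustrate the point about differing codes, here is a small Python
sketch (Python is my choice here, not something from the thread; the
helper name describe is made up for the example). It prints the code
point and UTF-8 bytes of each letter, and shows that Unicode
normalization can split the accented letter into a base letter plus a
combining mark.

```python
import unicodedata

def describe(ch):
    """Return (code point, UTF-8 bytes as hex) for one character."""
    return f"U+{ord(ch):04X}", ch.encode("utf-8").hex()

print(describe("a"))   # ('U+0061', '61')
print(describe("å"))   # ('U+00E5', 'c3a5')

# NFD normalization decomposes the accented letter into the plain
# letter followed by a combining ring above.
decomposed = unicodedata.normalize("NFD", "å")
print([f"U+{ord(c):04X}" for c in decomposed])  # ['U+0061', 'U+030A']
```

The plain byte-for-byte comparison never sees an "a" inside "å", which
is why some conversion step is needed before matching.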
 
> If worst comes to worst and I can't find a way to get grep to do
> this, I'll just put together a substitution table,
> convert /usr/share/dict/words to words.ascii, line for line, search
> words.ascii, get the line number, and pull that line out of words.
> Crude, but effective.

AFAIK, this is the only way to do what you want.
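
A rough sketch of that idea in Python, under a few assumptions: instead
of a hand-written substitution table and a separate words.ascii file, it
folds each word in memory by stripping combining marks, matches the
pattern against the folded form, and reports the original spelling. The
names fold and lookup are invented for this example, and letters with no
decomposition (such as "ø") would still need an explicit table entry.

```python
import re
import unicodedata

def fold(word):
    """Strip combining marks so that e.g. 'å' and 'ö' become 'a' and 'o'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def lookup(pattern, words):
    """Match pattern against the folded word, but return the original word."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if rx.search(fold(w))]

# Small in-memory stand-in for the word list (illustrative data only).
sample = ["smörgåsbord", "smart", "sword"]
print(lookup("^smor.*rd$", sample))  # ['smörgåsbord']
```

To run it against the real list, the sample could be replaced with the
stripped lines of /usr/share/dict/words opened with encoding="utf-8".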

-- 
To be is to do.	-- I. Kant
To do is to be.	-- A. Sartre
Do be a Do Bee!	-- Miss Connie, Romper Room
Do be do be do!	-- F. Sinatra
Yabba-Dabba-Doo! -- F. Flintstone
