
Re: Grep on dictionary words

> ISTM that because the output of strings is not a discrete list of
> potential words, but is instead a long list of concatenated
> characters, this problem is really rather daunting. The output should
> probably first be broken up into something resembling words, perhaps
> by breaking on non-alphabetic characters. That should do two things: 1)
> get you something that resembles words to actually test and 2) give you
> a somewhat smaller set of "stuff" to check.
> This won't necessarily handle "compound" words though where two
> word-like things are jammed together, or an actual word is embedded
> within a string of nonsense.
> I think this problem is potentially rather harder than I thought when
> I saw OP's original question.
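
For reference, the split-then-check idea quoted above might look roughly
like the sketch below. The function name dict_candidates, the 3-letter
minimum, and the word-list path are all assumptions for illustration,
not something from the original question. As the quote notes, it will
not catch words jammed together or embedded in nonsense.

```shell
# Hypothetical helper: split stdin on non-alphabetic characters and keep
# only the candidates that appear in a word list.
# $1 is the path to a word list with one word per line (an assumption;
# /usr/share/dict/words is a common location but is not guaranteed).
dict_candidates() {
    LC_ALL=C tr -cs 'A-Za-z' '\n' |   # each run of non-letters becomes one newline
        grep -E '.{3,}' |             # drop fragments shorter than 3 letters
        LC_ALL=C sort -u |            # each candidate only once
        LC_ALL=C grep -Fxi -f "$1"    # whole-line, case-insensitive, fixed-string match
}

# Usage (file names are placeholders):
#   strings some_binary | dict_candidates /usr/share/dict/words
```

Deduplicating with sort -u before the dictionary lookup keeps the set
of "stuff" to check small, which was the second goal mentioned above.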

It does not need to be comprehensive. Would it be possible to only
show lines that have "words" (continuous strings) of alpha characters
that are all lowercase except for the first character? That would
handle about 90% of the work by eliminating lines like these:
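
A rough sketch of such a filter, assuming a "word" means at least three
letters (that threshold and the function name wordlike_lines are my own
assumptions): keep a line only if it contains a run of letters, bounded
by non-letters or the line ends, in which every character after the
first is lowercase.

```shell
# Hypothetical filter: print only the lines of stdin that contain at
# least one run of letters that is all lowercase except, optionally,
# for its first letter. The minimum length of 3 is an arbitrary choice.
wordlike_lines() {
    LC_ALL=C grep -E '(^|[^A-Za-z])[A-Za-z][a-z]{2,}([^A-Za-z]|$)'
}

# Usage (file name is a placeholder):
#   strings some_binary | wordlike_lines
```

With this pattern a line such as "libc.so.6" is kept because of "libc",
while "XyZ123" and "GLIBC_2.2.5" are dropped, since neither contains a
run of letters with that shape.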

Dotan Cohen

