
Re: Should nonbreakable space belong to whitespace class?



Denis Barbier wrote:
Miroslav Kure wrote:

Unfortunately, the nonbreakable space is not included in the character class \s or [:space:] (aka whitespace). Since it is usually indistinguishable from an ordinary space in most fonts, I would say that the nonbreakable space should be added to the whitespace class in regexp libraries.

No, that would defeat its purpose; a non-breaking space is used to glue two words together.

But only in the graphical sense. In the logical sense they are still two separate words.

Isn't the real problem that Miroslav should have used '\b' to identify the boundary between word and non-word text, or '[:^word:]' to identify all non-word characters? Ideally this should even catch the invisible word separators used in some cases in some languages. This has only one problem: the soft hyphen is for some reason not classified as a word character (in da_DK and fo_FO), which it logically should be.
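
To illustrate the difference, here is a rough sketch in Python rather than the regexp library discussed in this thread, so the details will differ from one library and locale to another. The helper names are made up for the example, and re.ASCII is used only as a stand-in for a whitespace class that leaves out the nonbreakable space.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE
SHY = "\u00ad"   # SOFT HYPHEN


def split_on_whitespace(text):
    """Split on \\s+ with an ASCII-only whitespace class.

    re.ASCII limits \\s to [ \\t\\n\\r\\f\\v]; this is an assumption used
    to mimic a library whose whitespace class excludes the no-break space.
    """
    return re.split(r"\s+", text, flags=re.ASCII)


def find_words(text):
    """Collect runs of word characters; everything else separates words."""
    return re.findall(r"\w+", text)


sample = "foo" + NBSP + "bar baz"

# The no-break space is not whitespace here, so "foo" and "bar" stay glued.
print(split_on_whitespace(sample))

# Matching on word characters treats the no-break space as a separator.
print(find_words(sample))

# The soft hyphen is not a word character either, so it splits the word.
print(find_words("co" + SHY + "operate"))
```

Matching on word characters separates the two words regardless of how the whitespace class is defined, but, as noted above, it also cuts a word at a soft hyphen.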

Jacob
--
"... there may be many others,
 but they haven't been discovered"             -- Tom Lehrer


