Re: POSIX regular expressions (was: autodetecting MBR location)
Tollef Fog Heen <tollef@add.no> writes:
> * (Colin Watson)
>
> | >For a non-POSIX regex, that is.
> |
> | Could you point me to some documentation about this? regex(7) claims
> | to describe POSIX 1003.2 regular expressions, and describes
> | leftmost-first behaviour.
>
> Hmm. Strange. Mastering Regular Expressions by O'Reilly has
> something about this, where they claim otherwise. I don't have the
> POSIX specification so I can't check myself, though.
>
> | So is there no correct POSIX regex library in Debian?
>
> No, not if MRE is right. Which I suppose it is, but am not 100% sure
> of, as I haven't read the specs.
Well, draft 4 of the next POSIX revision (http://www.opengroup.org/austin/)
is not the latest draft, but it is the latest I have around here.
I see no reason to assume this part has changed since POSIX: I think
all the regex changes are in the area of character classes, because
no two implementations implemented those the same way. The draft
says, in part:
The search for a matching sequence starts at the beginning of a string and stops when the
first sequence matching the expression is found, where first is defined to mean "begins
earliest in the string". If the pattern permits a variable number of matching characters and
thus there is more than one such sequence starting at that point, the longest such sequence
is matched. For example: the BRE "bb*" matches the second to fourth characters of abbbc,
and the ERE (wee|week)(knights|night) matches all ten characters of weeknights.
That does not sound as if MRE were right.
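To make the rule concrete, here is a minimal Python sketch of the "begins earliest, then longest" selection the draft describes. It is only an illustration under assumptions: the helper name leftmost_longest is made up for this example, the search is a brute-force loop rather than how any real POSIX library works, and it is only meant for simple patterns without anchors or lookarounds. Python's own re.search picks the first alternative that works (leftmost-first), which is why it serves as the contrast. The pattern wee|week is my own example, not one from the draft.

```python
import re

def leftmost_longest(pattern, text):
    """Return (start, end) of the leftmost-longest match, or None.

    Brute force: walk the start positions from left to right and, for
    each one, try the longest candidate span first. The first span the
    pattern matches in full is the earliest-starting, longest match.
    """
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        for end in range(len(text), start - 1, -1):
            # fullmatch only succeeds if the pattern covers text[start:end] exactly
            if compiled.fullmatch(text, start, end):
                return (start, end)
    return None

# The two examples from the draft (offsets are 0-based, end is exclusive):
print(leftmost_longest("bb*", "abbbc"))                              # (1, 4)
print(leftmost_longest("(wee|week)(knights|night)", "weeknights"))   # (0, 10)

# A case where leftmost-first and leftmost-longest disagree:
print(re.search("wee|week", "weeknights").span())                    # (0, 3)
print(leftmost_longest("wee|week", "weeknights"))                    # (0, 4)
```

The first two results agree with the draft: characters two to four of abbbc, and all ten characters of weeknights. The last pair shows the difference under discussion: a leftmost-first engine stops at "wee", while the rule quoted above requires "week".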
Regards - Kai Henningsen
--
http://www.cats.ms
Spuentrup CTI Fon: +49 700 CALL CATS (=22 55 22 87)
Windbreede 12 Fax: +49 251 322312 99
D-48157 Muenster Mob: +49 161 322312 1
Germany GSM: +49 171 7755060