
Re: C comment extraction, or a bag .deb of small commands



On Sun, Jan 19, 2003 at 07:38:09AM +0200, Richard Braakman wrote:
[snip]
> > 	\/\*([^*]|\*[^/])*\*\/
> 
> For what language? :)  C comments are a bit more complicated than that.
> For example, you can break a */ sequence across lines:
> 
> /* comment *\
> /

Hmm. I didn't know this before. :-)  (Gets an idea for an IOCCC entry :-P)

> Also, C99 accepts // comments, and handling them separately
> from /* */ comments does not work:
[snip]

In that case, you could just alternate between the two:

	(\/\*([^*]|\*[^/])*\*\/|\/\/[^\n]*)

I *think* this should cover all combinations of /*...*/ and //... comments.
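
A quick way to check a pattern like this is to run it over a few sample inputs. The sketch below does that in Python (my choice, nothing in the thread uses it). ORIGINAL is the alternation above with the inner group made non-capturing. VARIANT is a hypothetical rewrite of the block-comment branch, not something proposed in the thread, which also copes with a run of asterisks right before the closing slash (as in /* foo **/), a case the original pattern appears to run past. The names find_comments, SAMPLE and TRICKY are made up for illustration.

```python
import re

# The alternation from the message, inner group made non-capturing.
ORIGINAL = re.compile(r"/\*(?:[^*]|\*[^/])*\*/|//[^\n]*")

# Hypothetical variant (an assumption, not from the thread): the block
# branch accepts one or more '*' before the closing '/'.
VARIANT = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//[^\n]*")


def find_comments(pattern, source):
    """Return every substring of source that the pattern matches."""
    return [m.group(0) for m in pattern.finditer(source)]


# Ordinary input: both patterns find the same three comments.
SAMPLE = "int a; /* one */ int b; // two\nint c; /* three\n spans lines */\n"

# A comment ending in '**/': ORIGINAL consumes the '**' pair as
# "star followed by non-slash" and keeps going to the next '*/'.
TRICKY = "/* stars **/ x; /* next */"

if __name__ == "__main__":
    for name, pattern in (("original", ORIGINAL), ("variant", VARIANT)):
        print(name, find_comments(pattern, SAMPLE))
        print(name, find_comments(pattern, TRICKY))
```

Neither pattern knows about string literals or about a */ split by a backslash-newline, so this only tests the comment syntax itself.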

But as someone pointed out, this doesn't handle /* and // sequences
appearing inside quoted strings. I overlooked that aspect of it.
Nevertheless, it *must* be possible to write a regex for it, since
mathematically speaking, a finite state machine is powerful enough to
tokenize C, and anything a finite state machine can recognize can be
written as a regex. Of course, that doesn't say anything about how complex
the regex might have to be to cover all cases. :-)
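
To make the finite state machine idea concrete, here is a minimal sketch of one written out by hand rather than as a regex. It is in Python and is not a full C tokenizer. It first deletes backslash-newline pairs (the way a C compiler does before it looks for comments), then tracks whether it is in code, a block comment, a line comment, a string literal or a character constant. The function name extract_comments and the state names are invented for illustration, and trigraphs and anything else preprocessor-related are ignored.

```python
def extract_comments(source):
    """Return the comments in C source, skipping string and char literals.

    A sketch under stated assumptions: trigraphs are ignored and an
    unterminated block comment is silently dropped.
    """
    # Delete backslash-newline pairs so a split "*/" is rejoined.
    text = source.replace("\\\n", "")
    comments = []
    state = "code"  # one of: code, block, line, string, char
    start = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        nxt = text[i + 1] if i + 1 < n else ""
        if state == "code":
            if ch == "/" and nxt in ("*", "/"):
                state = "block" if nxt == "*" else "line"
                start = i
                i += 2  # step past the opener so "/*/" does not close itself
                continue
            if ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
        elif state == "block":
            if ch == "*" and nxt == "/":
                comments.append(text[start:i + 2])
                state = "code"
                i += 2
                continue
        elif state == "line":
            if ch == "\n":
                comments.append(text[start:i])
                state = "code"
        else:  # inside a string literal or character constant
            if ch == "\\":
                i += 2  # skip the escaped character, e.g. \" or \'
                continue
            if (state == "string" and ch == '"') or (state == "char" and ch == "'"):
                state = "code"
        i += 1
    if state == "line":  # a // comment that runs to end of input
        comments.append(text[start:])
    return comments


if __name__ == "__main__":
    print(extract_comments('char *s = "/* not a comment */"; // real\n'))
    print(extract_comments("/* comment *\\\n/ int x;"))
```

Each branch only looks at the current state and the next character or two, which is why the same logic could in principle be flattened into one (much uglier) regex.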


T

-- 
The most powerful one-line C program: #include "/dev/tty" -- IOCCC


