
Re: Introducing codesearch.debian.net, a regexp code search engine

On Tue, 06 Nov 2012 21:22:17 +0100
Michael Stapelberg <stapelberg@debian.org> wrote:

> > Another important step would be a way of excluding matches
> > within comments from the results.
> I have considered this, but when you think about it, identifiers
> (variable names, function names, …) and comments are really all there
> is that is searchable in source code. Could you give me a few convincing
> points on why it would be useful to exclude comments (that is, examples)?

Any search term which can be a variable name and frequently occurs in
licence headers or doxygen markup or email addresses (copyright).

(I dread to think what results come from searching just for 'debian',
even with filetype:c it's all licence headers / email addresses.)


Any similar term which is frequently used across doxygen-style API docs
will give a mix of comments and code.

That's just swamped by licences, as would be 'received' and lots of other
common words (which are, rightly or wrongly, used as variable names or
as part of function names).

Without exclusions on comments (and without fixes for the filetype:
matches below), any common word is going to be swamped.
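
To make the request concrete, here is a rough sketch of the kind of
filtering I mean. It is only an illustration under my own assumptions, not
a description of how codesearch works: the function names are hypothetical,
it only understands C-style comments, and it blanks them out with a regexp
rather than a real lexer.

```python
import re

# One alternative per token kind. String and char literals are matched
# first so that comment markers inside them are not mistaken for comments.
_C_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # string literal (kept as-is)
    r"|'(?:\\.|[^'\\])*'"     # char literal (kept as-is)
    r"|/\*.*?\*/"             # block comment (blanked)
    r"|//[^\n]*",             # line comment (blanked)
    re.DOTALL,
)


def blank_c_comments(source):
    """Replace comment text with spaces, keeping newlines so that
    line numbers still line up with the original file."""
    def repl(match):
        text = match.group(0)
        if text.startswith("/*") or text.startswith("//"):
            return re.sub(r"[^\n]", " ", text)
        return text
    return _C_TOKEN.sub(repl, source)


def search_outside_comments(pattern, source):
    """Return (line number, original line) for lines whose non-comment
    part matches the regexp."""
    code_only = blank_c_comments(source)
    original_lines = source.splitlines()
    hits = []
    for lineno, line in enumerate(code_only.splitlines(), start=1):
        if re.search(pattern, line):
            hits.append((lineno, original_lines[lineno - 1]))
    return hits
```

With something like this, a licence header or a copyright email address
mentioning 'debian' would not be reported, while a variable called debian
still would be. String literals are deliberately left searchable.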

> > The filetype seems a little confused in places too. Searching for
> > things in filetype:perl I get matches in debian/control and
> > debian/copyright.
> Can you give me the exact query for which this happens, please?


filetype:perl just doesn't seem to be working:
... lists a lot of .c files ...

filetype:python does the same - some .py but then a lot more .c
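
For what it is worth, the behaviour I would expect from filetype: is roughly
the following. This is a sketch only: the extension table and the function
names are my own assumptions, and I do not know how codesearch actually
decides the type of a file.

```python
import os

# Hypothetical mapping from a filetype: keyword to file extensions.
FILETYPE_EXTENSIONS = {
    "c": {".c", ".h"},
    "perl": {".pl", ".pm"},
    "python": {".py"},
}


def matches_filetype(path, filetype):
    """True if the extension of the path belongs to the requested filetype."""
    ext = os.path.splitext(path)[1].lower()
    return ext in FILETYPE_EXTENSIONS.get(filetype, set())


def filter_results(paths, filetype):
    """Keep only the result paths that belong to the requested filetype."""
    return [p for p in paths if matches_filetype(p, filetype)]
```

With filetype:perl, a result list containing foo/bar.pl, src/main.c,
debian/control and lib/Foo.pm would be reduced to the .pl and .pm files.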


Neil Williams
