[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#795955: Complex regular subexpression recursion limit in cruft.pm



On 2015-08-30 00:02, Bastien ROUCARIES wrote:
> On Sat, Aug 29, 2015 at 10:31 PM, David Prévot <taffit@debian.org> wrote:
>> [...]
>>
>>> I get quite some warnings from perl:
>>> Complex regular subexpression recursion limit (32766) exceeded at /usr/share/lintian/checks/cruft.pm line 939.
> 
> it is the remove comments regex from perlfaq ....
>    # from perl faq strip comments
>     $strip
>       =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?(?=\n)|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined
> $3 ? $3 : ""#gse;
> 
> How can I solve this ?
> 
> Bastien
> 

It is not certain we can solve it in pure Perl.  The Perl "Regex" engine
(like many other languages's) is not a regular expression.  This means
we can exponential runtime for expressions, where a true regular
expression could do it in polynomial time[1].

We can try to help Perl by reducing the points where this can do a
backtrack (By using *+, where possibly to forbid backtracking).
Alternatively, we can reduce the input size.  But if that does not work,
we will have to remove the regex or replace it with a simpler one.
  Though, be careful with *+ (and ?+) as it can subtly change the regex
to no longer match where the old one could.

Anyway, be careful in general with adding any non-trivial regex.

Thanks,
~Niels

[1] https://swtch.com/~rsc/regexp/regexp1.html

Do note that the graph for Thompson NFA is measured in "us", while the
perl graph is in "s".



Attachment: signature.asc
Description: OpenPGP digital signature


Reply to: