Bug#181872: Patch
On Thu, Mar 13, 2003 at 07:58:05PM +0100, Josip Rodin wrote:
> On Thu, Mar 13, 2003 at 06:46:35PM +0100, Frank Lichtenheld wrote:
> > > The right fix would be simply
> > > $long_desc =~ s,<(?:URL:\s*)?(http://[^>]+)\s*>,\&lt\;$1\&gt\;,go;
> > >
> > > Right?
> >
> > Yours would do also. The main difference in result is that you delete
> > the 'URL:' while mine preserves it. Only a cosmetic difference.
>
> Actually I did that off the top of my head, focusing on the [^>] part.
> I thought that the "URL:" part was included in the anchor, but I guess
> that's handled by some other part of the code.
Ok. Let's elaborate a little. Sorry if it's too long.

$long_desc =~ s,<((URL:)?http://[\S~-]+?/?)>,\&lt\;$1\&gt\;,go;
                 ^^    ^ ^                ^
                 12    2 X                1

That's the original regex. What ends up in the output is the first capture
group, i.e. everything matched between '(' 1 and ')' 1. The problem in the
bug was that no whitespace was allowed at point X, so I inserted \s* there:

$long_desc =~ s,<((URL:)?\s*http://[\S~-]+?/?)>,\&lt\;$1\&gt\;,go;
                         ^^^
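
To make the effect of the added \s* concrete, here is a small sketch. It is a port of the two patterns to Python's re module (not the Perl code itself), assuming the replacement is meant to produce &lt;...&gt; around the first capture group. The sample description text and the helper name escape_url are made up for illustration.

```python
import re

# Python ports of the original and the patched Perl pattern (assumption:
# these constructs behave the same in Python's re as in Perl).
ORIGINAL = re.compile(r"<((URL:)?http://[\S~-]+?/?)>")
PATCHED = re.compile(r"<((URL:)?\s*http://[\S~-]+?/?)>")


def escape_url(pattern, text):
    """Replace <...> around a URL with &lt;...&gt;; \\1 plays the role of Perl's $1."""
    return pattern.sub(r"&lt;\1&gt;", text)


# Hypothetical long description with whitespace after 'URL:'.
sample = "See <URL: http://www.debian.org/> for details."

print(escape_url(ORIGINAL, sample))  # no match, text is left as-is
print(escape_url(PATCHED, sample))   # angle brackets are escaped
```

With the original pattern the space after 'URL:' prevents any match; with the patched one the whole 'URL: http://...' part lands in group 1.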
In your regex

$long_desc =~ s,<(?:URL:\s*)?(http://[^>]+)\s*>,\&lt\;$1\&gt\;,go;
                 ^         ^ ^            ^ ^
                 1         1 2            2 Y

only what's between '(' 2 and ')' 2 is included, because the '?:' makes the
first parentheses non-capturing. So the 'URL:' is discarded.

Whether you write [\S~-]+?> or [^>]+> should make no big difference (you
are allowing more chars, whitespace included), especially because the first
one is a non-greedy match. The \s* at Y is a good addition by you.
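
The difference between the two variants can be checked with a short sketch, again ported to Python's re module and assuming the &lt;...&gt; replacement. The names FRANK, JOSIP and escape_url as well as the sample strings are only illustrative.

```python
import re

# Ports of the two proposed patterns (assumption: same semantics as in Perl).
FRANK = re.compile(r"<((URL:)?\s*http://[\S~-]+?/?)>")    # keeps 'URL:' in group 1
JOSIP = re.compile(r"<(?:URL:\s*)?(http://[^>]+)\s*>")    # 'URL:' is non-capturing


def escape_url(pattern, text):
    """Replace <...> around a URL with &lt;...&gt; using the first capture group."""
    return pattern.sub(r"&lt;\1&gt;", text)


with_prefix = "<URL: http://www.debian.org/>"
trailing_space = "<http://www.debian.org/ >"

print(escape_url(FRANK, with_prefix))     # 'URL: ' is preserved
print(escape_url(JOSIP, with_prefix))     # 'URL: ' is dropped
print(escape_url(FRANK, trailing_space))  # \S cannot cross the space: no match
print(escape_url(JOSIP, trailing_space))  # matches
```

One detail the sketch shows: because [^>]+ is greedy and also accepts whitespace, a trailing space before '>' ends up inside the capture group, so the \s* at Y never gets anything to match in that case.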
> > > > + $long_desc =~ s/\&/\&amp\;/go;
> The problem is that if someone puts a proper &amp; in a URL, your regexp
> would happily convert it to &amp;amp; :)
But why would someone do this? The main place where a long description
is displayed is a package manager (dselect/aptitude), not a website. I
would consider this a bug in the package, not in the code.
But if you really want to allow this, you have to write something like:
$long_desc =~ s/\&(?!(?:#x?[\da-fA-F]+|\w+)\;)/\&amp\;/go;
Seems to work well, but no warranty. Happy regexing ;)
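
A quick way to see what the negative lookahead does is the following sketch. It is a port to Python's re module, assuming the replacement is &amp; as in the patch line quoted above. The function name escape_ampersands and the sample text are made up.

```python
import re

# '&' is escaped only when it does NOT already start something that looks
# like an entity: &#38; (decimal), &#x26; (hex) or &name; (named).
BARE_AMP = re.compile(r"&(?!(?:#x?[\da-fA-F]+|\w+);)")


def escape_ampersands(text):
    """Turn bare '&' into '&amp;' while leaving existing entities alone."""
    return BARE_AMP.sub("&amp;", text)


# Hypothetical description mixing bare ampersands and existing entities.
sample = "R&D, see http://example.org/?a=1&amp;b=2 and &#38; or &#x26;"
print(escape_ampersands(sample))

# A raw '&' inside a URL is still escaped, since 'b=2' is not 'name;'.
print(escape_ampersands("http://example.org/?a=1&b=2"))
```

The lookahead only checks the shape '&word;', so text such as 'AT&T; and others' would be taken for an entity and left alone. That is one reason for the "no warranty".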
Greetings,
Frank
--
*** Frank Lichtenheld <frank@lichtenheld.de> ***
*** http://www.djpig.de/ ***
see also: - http://www.usta.de/
- http://fachschaft.physik.uni-karlsruhe.de/