Bug#181872: Patch
On Thu, Mar 13, 2003 at 10:12:14PM +0100, Josip Rodin wrote:
> On Thu, Mar 13, 2003 at 09:27:28PM +0100, Frank Lichtenheld wrote:
> > Ok. Let's elaborate a little. Sorry if it's too long.
>
> Oh, I understood perfectly what you said, I just meant to say that I thought
> the original code preserved URL: within the <a> tag by mistake.
Ok. But it is not within the <a> tag; it only converts
<URL:http://...> to &lt;URL:http://...&gt;. The regex that converts
http://... to <a href="http://...">http://...</a> is this one, one line
below:
[$long_desc =~ s,(http://[\S~-]+?/?)((\&gt\;)?[)]?[']?[.\,]?(\s|$)),<a href=\"$1\">$1</a>$2,go;]
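To illustrate what that substitution does, here is a minimal sketch, using `(\&gt\;)?` for the optional escaped closing bracket. The sub name `linkify` and the sample strings are only assumptions for the example, not part of the real script.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the link-conversion substitution.
sub linkify {
    my ($long_desc) = @_;
    # $1 is the URL itself; $2 is any trailing "&gt;", ")", "'", "." or ","
    # followed by whitespace or end of string, which stays outside the link.
    $long_desc =~ s,(http://[\S~-]+?/?)((\&gt\;)?[)]?[']?[.\,]?(\s|$)),<a href=\"$1\">$1</a>$2,g;
    return $long_desc;
}

print linkify("see http://a.b/ now"), "\n";
print linkify("(see http://example.org/foo)."), "\n";
```

The trailing punctuation is captured separately so that a closing parenthesis or a full stop after the URL does not end up inside the href.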
After all the discussions I would propose as the patch to apply (it
contains elements of both versions):
$long_desc =~ s,<((URL:)?\s*http://[^>]+)\s*>,\&lt\;$1\&gt\;,go;
In the end it's your decision.
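A minimal sketch of how the proposed substitution would behave, writing the replacement with escaped `&lt;` and `&gt;` entities. The sub name `escape_bracketed_urls` and the sample inputs are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the proposed substitution.
sub escape_bracketed_urls {
    my ($long_desc) = @_;
    # Match "<", an optional "URL:" prefix, optional whitespace, then
    # everything up to the first ">", and re-emit it with escaped brackets.
    $long_desc =~ s,<((URL:)?\s*http://[^>]+)\s*>,\&lt\;$1\&gt\;,g;
    return $long_desc;
}

print escape_bracketed_urls("See <URL:http://www.debian.org/> and <http://example.org/x>."), "\n";
```

Both forms, with and without the URL: prefix, are handled by the same pattern, and the match stops at the first closing bracket.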
> Well, I think in principle it's much better to just match until the first
> closing bracket since IME such things are less prone to errors. Of course,
> if someone found URLs with <> in them, that idea goes down the drain...
Ok, let's wait for a package maintainer to do this. Then we can handle
it ;)
> > But if you want to really allow this you have to write something like:
> > $long_desc =~ s/\&(?!(?:#x?[\da-fA-F]+|\w+)\;)/\&amp\;/go;
> > Seems to work well, but no warranty. Happy regexing ;)
>
> Not sure offhand why you both check the entity format and use a rather
> simple \w+ as an alternative... A sentence could end talking about
> Barnes&Noble; and then it could be followed by another sentence :)
Hmmm, I see the problem. The only solution seems to be to make a list of
allowed entities:
$long_desc =~ s/\&(?!(?:#x?[\da-fA-F]+|amp|gt|lt|quot)\;)/\&amp\;/go;
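A minimal sketch of the allowed-entities approach, assuming the replacement is the escaped `&amp;` entity. The sub name `escape_stray_ampersands` and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper: escape every "&" that does not start a numeric
# character reference or one of the four allowed named entities.
sub escape_stray_ampersands {
    my ($long_desc) = @_;
    $long_desc =~ s/\&(?!(?:#x?[\da-fA-F]+|amp|gt|lt|quot)\;)/\&amp\;/g;
    return $long_desc;
}

for my $text ('Barnes&Noble; and more', 'a &lt; b &amp; c', 'caf&#233; R&D') {
    print escape_stray_ampersands($text), "\n";
}
```

With the list in place, "Barnes&Noble;" is escaped because "Noble" is not an allowed entity name, while the existing &lt;, &amp; and &#233; references are left alone.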
Greetings,
Frank
--
*** Frank Lichtenheld <frank@lichtenheld.de> ***
*** http://www.djpig.de/ ***
see also: - http://www.usta.de/
- http://fachschaft.physik.uni-karlsruhe.de/