
Re: Bug#47366: &quot; and &nbsp; aren't converted, probably should be



On Thu, 14 Oct 1999, David Coe wrote:

> Package: unhtml
> Version: 2.2-2
> Severity: normal
> 
> unhtml this:
> 
> <html>
> <body>
> <p>This &quot;thing&quot; doesn't work.&nbsp;&nbsp;Does that one?
> </body>
> </html>
> 
> I believe it should convert each &quot; to " and each &nbsp; to space,
> but it leaves them as is and the output is quite ugly.  
> 
> Probably an upstream bug.  Or is it a feature?  Thanks. 
>
As I understand the underlying C code, unhtml is still far too generic
to handle these cases, because it only checks for tag pairs like
"<SCRIPT></SCRIPT>" in a case-insensitive manner.  It was evidently
never designed to convert HTML entities into their plain-text
counterparts and would have to be extended with that capability by
some gifted C programmer (which I unfortunately am not).
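
A minimal sketch of what such an entity pass could look like in C, for
anyone who picks this up.  Nothing here is taken from the unhtml
source: the table, the struct, and the function name decode_entities
are all hypothetical, and only a handful of named entities are
covered.  Mapping &nbsp; to a plain space is an assumption that follows
the request in the report.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entity table; not part of unhtml. */
struct entity {
    const char *name;  /* entity as written in the HTML source */
    char ch;           /* plain-text replacement character     */
};

static const struct entity entities[] = {
    { "&quot;", '"' },
    { "&nbsp;", ' ' },  /* assumption: plain space, as the report asks */
    { "&amp;",  '&' },
    { "&lt;",   '<' },
    { "&gt;",   '>' },
};

/*
 * Decode the known entities in place.  The output is never longer
 * than the input, so no extra buffer is needed.  Anything not in the
 * table is copied through unchanged.
 */
static void decode_entities(char *s)
{
    const size_t n = sizeof entities / sizeof entities[0];
    char *out = s;

    while (*s) {
        size_t i;

        for (i = 0; i < n; i++) {
            size_t len = strlen(entities[i].name);

            if (strncmp(s, entities[i].name, len) == 0) {
                *out++ = entities[i].ch;
                s += len;
                break;
            }
        }
        if (i == n)         /* no entity matched at this position */
            *out++ = *s++;
    }
    *out = '\0';
}
```

Applied to the text line from the report, this turns each &quot; into
a double quote and each &nbsp; into a space.  Numeric references such
as &#160; and the rest of the named entities are left out of this
sketch.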

I'll forward this message to the upstream author and to the Debian
development mailing list, and ask for a willing C programmer to take
over maintenance of this package, in the hope that unhtml will be
enhanced further.
                                Thank you, P. *8^)
-- 
   --------- Paul Seelig <pseelig@goofy.zdv.uni-mainz.de> -----------
   African Music Archive - Institute for Ethnology and Africa Studies
   Johannes Gutenberg-University   -  Forum 6  -  55099 Mainz/Germany
   ------------------- http://ntama.uni-mainz.de --------------------

