Re: Man pages and UTF-8
On Fri, Sep 14, 2007 at 10:39:10AM +0100, Colin Watson wrote:
> On Wed, Sep 12, 2007 at 02:25:26AM +0200, Adam Borowski wrote:
> > On Tue, Sep 11, 2007 at 09:55:44AM +0100, Colin Watson wrote:
> > > Is this what your "hack" pipeline implements? If so, I'd love to see it;
> > > if not, I'm happy to implement it.
> >
> > The prototype is:
> >   pipeline_command_args (p, "perl", "-CO", "-e",
> >                          "use Encode;"
> >                          "undef $/;"
> >                          "$_=<STDIN>;"
> >                          "eval{print decode('utf-8',$_,1)};"
> >                          "print decode($ARGV[0],$_) if $@",
> >                          page_encoding,
> >                          NULL);
> > so it's similar. "Slurp everything into core" in C is a page of code, your
> > idea of a static buffer makes it simpler; and I'm not in a position to
> > complain that it's another hack :p
>
> Current man-db makes the buffering pretty trivial:
>
>   const char *buf = pipeline_peek (p, 65536);
>
> I'll try to implement something like this in C, then.

I've now done this. The code is in bzr here and fully integrated into
all the man-db programs that care about encodings:

  http://www.chiark.greenend.org.uk/~cjwatson/bzr/man-db/trunk/
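
For anyone following along without the branch checked out, the approach
discussed above (look at a buffered prefix of the page, try it as UTF-8,
and fall back to the page's declared encoding otherwise) can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the code in the branch: looks_like_utf8 and guess_encoding are
hypothetical names, the buffer is assumed to be a prefix of the kind
pipeline_peek returns, and the check is structural only (it does not
reject every overlong or surrogate form).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of man-db's API: return true if the
 * first len bytes of buf have the byte structure of UTF-8 (valid lead
 * bytes followed by the right number of continuation bytes).  A
 * multi-byte sequence cut off by the end of the buffer is tolerated,
 * because buf may be only a prefix of the page. */
static bool looks_like_utf8 (const char *buf, size_t len)
{
	const unsigned char *s = (const unsigned char *) buf;
	size_t i = 0;

	while (i < len) {
		unsigned char c = s[i];
		size_t need;	/* continuation bytes expected after c */

		if (c < 0x80)
			need = 0;
		else if (c >= 0xC2 && c <= 0xDF)
			need = 1;
		else if (c >= 0xE0 && c <= 0xEF)
			need = 2;
		else if (c >= 0xF0 && c <= 0xF4)
			need = 3;
		else
			return false;	/* stray continuation or bad lead byte */

		++i;
		for (; need > 0; --need, ++i) {
			if (i >= len)
				return true;	/* sequence truncated by the prefix */
			if ((s[i] & 0xC0) != 0x80)
				return false;
		}
	}
	return true;
}

/* Hypothetical helper: choose the encoding to decode from.  Prefer
 * UTF-8 when the prefix looks like it, else use the declared one. */
static const char *guess_encoding (const char *buf, size_t len,
				   const char *page_encoding)
{
	return looks_like_utf8 (buf, len) ? "UTF-8" : page_encoding;
}
```

The one design point worth noting is the truncation case: since only a
fixed-size prefix is inspected, a multi-byte character split across the
end of that prefix should not count as evidence against UTF-8.
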
Cheers,
--
Colin Watson [cjwatson@debian.org]