
Re: Man pages and UTF-8



On Fri, Sep 14, 2007 at 10:39:10AM +0100, Colin Watson wrote:
> On Wed, Sep 12, 2007 at 02:25:26AM +0200, Adam Borowski wrote:
> > On Tue, Sep 11, 2007 at 09:55:44AM +0100, Colin Watson wrote:
> > > Is this what your "hack" pipeline implements? If so, I'd love to see it;
> > > if not, I'm happy to implement it.
> > 
> > The prototype is:
> >     pipeline_command_args (p, "perl", "-CO", "-e",
> >                            "use Encode;"
> >                            "undef $/;"
> >                            "$_=<STDIN>;"
> >                            "eval{print decode('utf-8',$_,1)};"
> >                            "print decode($ARGV[0],$_) if $@",
> >                            page_encoding,
> >                            NULL);
> > so it's similar.  "Slurp everything into core" in C is a page of code, your
> > idea of a static buffer makes it simpler; and I'm not in a position to
> > complain that it's another hack :p
> 
> Current man-db makes the buffering pretty trivial:
> 
>   const char *buf = pipeline_peek (p, 65536);
> 
> I'll try to implement something like this in C, then.

I've now done this. The code is in bzr here and fully integrated into
all the man-db programs that care about encodings:

  http://www.chiark.greenend.org.uk/~cjwatson/bzr/man-db/trunk/
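
For readers who want the gist without pulling the branch, here is a
minimal sketch of the "try UTF-8 first, otherwise fall back to the
page's declared encoding" check discussed above. It is not the code in
bzr: is_valid_utf8 and guess_encoding are hypothetical names, and the
sketch assumes the caller already holds a buffer (for example one
obtained from pipeline_peek). One caveat: a fixed-size peek can cut a
multibyte sequence at its end, which this simple validator would report
as invalid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if buf[0..len) is well-formed UTF-8,
 * 0 otherwise. Rejects overlong forms, surrogates, and code points
 * above U+10FFFF. */
static int is_valid_utf8 (const unsigned char *buf, size_t len)
{
	size_t i = 0;

	assert (buf != NULL || len == 0);

	while (i < len) {
		unsigned char c = buf[i];
		unsigned long cp, min;
		size_t need, k;

		if (c < 0x80) {			/* plain ASCII */
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {
			need = 1; cp = c & 0x1F; min = 0x80;
		} else if ((c & 0xF0) == 0xE0) {
			need = 2; cp = c & 0x0F; min = 0x800;
		} else if ((c & 0xF8) == 0xF0) {
			need = 3; cp = c & 0x07; min = 0x10000;
		} else
			return 0;		/* stray continuation or bad lead byte */

		if (len - i <= need)
			return 0;		/* sequence truncated at end of buffer */

		for (k = 1; k <= need; k++) {
			unsigned char cc = buf[i + k];
			if ((cc & 0xC0) != 0x80)
				return 0;	/* not a continuation byte */
			cp = (cp << 6) | (cc & 0x3F);
		}

		if (cp < min || cp > 0x10FFFF ||
		    (cp >= 0xD800 && cp <= 0xDFFF))
			return 0;		/* overlong, out of range, or surrogate */

		i += need + 1;
	}

	return 1;
}

/* Hypothetical helper: pick the encoding to decode from. If the buffer
 * validates as UTF-8, use that; otherwise trust page_encoding. */
static const char *guess_encoding (const unsigned char *buf, size_t len,
				   const char *page_encoding)
{
	return is_valid_utf8 (buf, len) ? "UTF-8" : page_encoding;
}
```

The result of guess_encoding would then be handed to whatever does the
actual conversion (iconv, say) as the source encoding.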

Cheers,

-- 
Colin Watson                                       [cjwatson@debian.org]
