Re: UTF-8 in copyright files?

On Tue, Dec 16, 2003 at 07:00:26PM +0100, Sven Luther wrote:
> On Tue, Dec 16, 2003 at 05:22:36PM +0000, Colin Watson wrote:
> > On Tue, Dec 16, 2003 at 04:38:23PM +0100, Sven Luther wrote:
> > > Yep, noticed that also, man simply removes the - from manpages,
> > > which is a pain, as it gets used quite a lot in manpages.
> > 
> > See /etc/groff/{man,mdoc}.local. (And no, groff doesn't remove "-", it
> > renders it as the Unicode HYPHEN character.)
> 
> Ok, will look.
> Well, I most definitely cannot see it, and I am using a UTF-8 aware
> xterm (uxterm, if I am not wrong).


  printf - | groff -Tutf8 | grep .


  printf - | groff -Tutf8 | grep . | od -tx1

I've seen a similar report before, but IIRC it turned out to be a
terminal/font problem.
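
For comparison, here is a minimal sketch of the bytes to look for in the
od output. It assumes groff emits U+2010 HYPHEN, whose UTF-8 encoding is
e2 80 90. It does not call groff at all; it just writes that encoding
directly, and the helper name hyphen_bytes is made up for illustration.

```shell
# Hypothetical helper (not part of groff): print the UTF-8 encoding of
# U+2010 HYPHEN as a compact hex string.
hyphen_bytes() {
  # \342\200\220 are the octal escapes for the bytes e2 80 90.
  printf '\342\200\220' | od -An -tx1 | tr -d ' \n'
}

hyphen_bytes; echo
```

If the groff pipeline's dump contains e2 80 90, the hyphen is being
emitted and the problem is on the terminal or font side; a bare 2d would
instead be a plain ASCII hyphen-minus.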

> > > Then, do not say that we should allow UTF-8 usage in files, if it is
> > > clearly not going to be supported. Come on, sarge+1 is probably more
> > > than 2 years away, if we cannot support UTF-8 by then, something is
> > > seriously wrong.
> > 
> > What I'm saying is that we depend on upstreams to make significant
> > design changes, and that's something we have little control over, so it
> > shouldn't be a release goal as such. Some changes can be made in Debian,
> > but I'm not sure radical design shifts with compatibility implications
> > qualify.
> 
> It would assuredly not be the first time that we fork upstream to fit
> our goals and needs.

I'm not sure you read what I wrote about major design and compatibility
changes ...

Also, to continue the example, not only am I unwilling to make an
incompatible change to groff in Debian but I'm not even remotely capable
of making such a massively sweeping change. I think you might
underestimate how much work it can be to alter older but widely-used
programs with just-send-8-bit assumptions to use UTF-8. I'm sure
maintainers of other such programs would feel the same way. This kind of
work can and should be done upstream.

That's why I suggested making it a "should": I'm quite happy to
acknowledge that it's a bug that groff doesn't support UTF-8 input, but
I think it would be pointless to hold up a release because of that.


