
Re: Proposed release goal: UTF-8 support



On 2013-10-01 00:54, Adam Borowski wrote:
> Hi!
> 
> As previously (https://lists.debian.org/debian-devel/2013/08/msg00217.html)
> discussed, I'd like to propose improving support for UTF-8.  All material
> shipped with Debian should be encoded this way, or, for media with an opaque
> format, as something capable of storing any Unicode character, without need
> for any extra steps.
> 


Hi,

Thanks for your interest in improving Debian Jessie.  I appreciate the
idea behind this goal, but I have a few concerns with some of the sub-goals.

> There are four sub-goals:
> 
> * user-accessible interfaces (GUI, stdin, stdout, stderr, command line,
>   reading/writing plain-text files) should be able to pass through UTF-8
>   data uncorrupted, by default

Do we have a number for how many GUIs/tools need updating here?  If not,
I am afraid this fails the "Measurable" requirement.  Even then, I doubt
it is realistic to assume there will be time to test all interfaces, so
perhaps limit the scope for Jessie (e.g. "All interfaces of/in
$classification").

> * UTF-8 should be properly displayed

Can you be a bit more specific here?  If this still refers to
"user-accessible interfaces", then my concerns from above also apply here.

> * all filenames in Debian packages, binary and source, must use UTF-8
>   (obviously, 7-bit ASCII satisfies this requirement)

Okay, thanks to the Lintian check, I think we can say this satisfies the
SMART requirements at least for binary packages (the check is not done
on source packages as I recall)[1].
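
To illustrate why this sub-goal is easy to measure: a check of this kind
boils down to asking whether the raw bytes of each file name decode as
UTF-8.  The following is only a rough sketch in Python, not how Lintian
itself implements the tag; the function names are made up for the example.

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file name bytes form valid UTF-8.

    Plain 7-bit ASCII names always pass, since ASCII is a subset of UTF-8.
    """
    try:
        name.decode("utf-8")  # strict decoding raises on invalid sequences
    except UnicodeDecodeError:
        return False
    return True


def invalid_names(names):
    """Hypothetical helper: keep only the names that are not valid UTF-8."""
    return [n for n in names if not is_valid_utf8(n)]
```

Running invalid_names() over the raw file names of a package would give
the kind of per-package count that makes the sub-goal measurable.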

> * all text files shipped by Debian should be encoded in UTF-8
> 

I suspect this suffers from the same problems as the "user-accessible
interfaces" sub-goal.  That said, I suspect most of the problematic
files will be easier to detect than a bug in a user interface (and
re-encoding is probably easier than fixing bugs in user interfaces).

For this particular sub-goal, it is also possible to reduce the scope by
file type (e.g. "For Jessie, we will fix all POD,$A,... and $N files").
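
As a sketch of how such numbers per file type could be gathered: the
Python snippet below counts files that fail to decode as UTF-8, grouped
by suffix.  It is an assumption-laden illustration only; the function
name and the idea of passing (path, content) pairs are mine, and files
in a legacy encoding that happen to be pure ASCII will pass unnoticed
(which is fine, as they are valid UTF-8 too).

```python
from collections import Counter
from pathlib import PurePosixPath


def non_utf8_by_suffix(files, suffixes):
    """Count files whose content is not valid UTF-8, grouped by suffix.

    files    -- iterable of (path, content) pairs, content as bytes
    suffixes -- set of suffixes in scope, e.g. {".pod"}
    """
    counts = Counter()
    for path, data in files:
        suffix = PurePosixPath(path).suffix
        if suffix not in suffixes:
            continue  # file type is out of scope for this release
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            counts[suffix] += 1
    return counts
```

The pairs could come from walking an unpacked package; the resulting
counts per suffix would show how much work each file type represents.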

> The wiki entry: https://wiki.debian.org/ReleaseGoals/utf-8
> 

By the way, this link does not appear to be included on
https://wiki.debian.org/ReleaseGoals.

Once again, I like the idea behind the goal.  However, I would like to
see some numbers on how much work there is (or how much work we/you
decide to target for Jessie) to ensure this goal is Attainable.

~Niels

[1] http://lintian.debian.org/tags/file-name-is-not-valid-UTF-8.html

Lists 5 packages and 15 affected files.

Technically, there is also:
http://lintian.debian.org/tags/file-name-in-PATH-is-not-ASCII.html

But there are no instances of that.


