
Re: 8bit characters in files in Debian packages



On Thu, Mar 31, 2011 at 07:05:10PM +0200, Raphael Hertzog wrote:
> On Thu, 31 Mar 2011, Bill Allombert wrote:
> > So this raises two issues:
> > 1) should non-7bit characters in filenames be allowed
> 
> Yes, I don't see a good reason to forbid them. In particular when we are
> in an international environment and we are targeting full localization.

Agreed.  That said, as covered under point (2), I don't think we
should permit arbitrary 8-bit encodings in packaged file names.

The problem here is that we can't automatically tell which encoding
a file name uses.  The name is only legible (if not valid) in a
locale whose encoding matches the one used to create it.  In every
other locale the file name is unusable, and may well break tools
as a result.
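
To illustrate the ambiguity, here is a minimal Python sketch.  The
byte string and the list of candidate encodings are my own
illustrative choices, not taken from any real package:

```python
# Decode one raw file name under several candidate encodings.
# CANDIDATES is an illustrative assumption, not an exhaustive list.
CANDIDATES = ("utf-8", "latin-1", "koi8-r")


def interpretations(raw: bytes) -> dict:
    """Map each candidate encoding to the decoded name, or None if invalid."""
    result = {}
    for enc in CANDIDATES:
        try:
            result[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            result[enc] = None
    return result


# The same four bytes decode to a different name under each legacy
# encoding, and are not valid UTF-8 at all.
for enc, text in interpretations(b"caf\xe9").items():
    print(enc, repr(text))
```

Nothing in the bytes themselves says which legacy reading was
intended, which is why a tool cannot pick the right one on its own.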

I think we need to draw a line between what users are permitted
to do here, and what the "Debian system" is permitted to do.

We should not prevent users from using the 8-bit encoding of their
choice for filenames, but I don't think we should allow this in
packages, where having a consistent encoding for everything is
rather desirable.

> > 2) if yes, should we require the filename to be in a correct UTF-8 encoding?
> 
> I think it would be good, yes. We have standardized on UTF-8 for almost
> everything and we should do the same for filenames.
> 
> Is there no lintian check covering this?

This would be a good lintian check, IMO.  Additionally, the test is
clear and unambiguous, so it could be made a fatal error to prevent
such packages from being uploaded.
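
As a rough sketch of how simple the test is, assuming the file names
are available as raw bytes: the Python below is only an illustration,
and the function names are hypothetical.  It is not lintian code
(lintian is written in Perl).

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file name decodes as strict UTF-8."""
    try:
        name.decode("utf-8")  # error handling is 'strict' by default
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_names(names):
    """Return the file names (as bytes) that are not valid UTF-8."""
    return [n for n in names if not is_valid_utf8(n)]
```

Plain ASCII names pass unchanged, since ASCII is a subset of UTF-8,
so only names using a legacy 8-bit encoding would be flagged.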


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.


