
Bug#396295: openssh-server: scp copies filenames with extended characters in the wrong character set, or sshd does not know about local locale for filenames



severity 396295 wishlist
thanks

On Tue, Oct 31, 2006 at 02:08:54AM +0200, Wouter Van Hemel wrote:
> Package: openssh-server
> Version: 1:4.3p2-5.1
> Severity: minor
> 
> If you copy files from one system to another with scp, the filenames don't 
> always end up in the expected character set. For instance, when files are 
> copied from an iso-8859-(1/15) locale machine to a machine with UTF-8 
> locale, the filenames show up unreadable to local console and X programs.
> 
> The openssh server probably needs to be aware of the local locale; but 
> that will probably not solve the case where one user's locale differs from 
> the system locale, unless the character sets happen to be rather compatible.
> 
> Am I missing an easy way to make sure filenames are stored with local 
> locale settings (in non-interactive login sessions such as scp)?

Filenames historically haven't really had an encoding in Unix; they're
just byte strings that may or may not even be representable in
your current locale. Messing with this in scp would open all sorts of
interesting cans of worms (what happens if the filename isn't
representable in the target encoding? what if you normally set your
locale only for interactive login sessions, so the locale apparent to
scp will be different? etc.). Dealing with this is exactly the sort of
thing that isn't going to be done in scp, which cannot be significantly
extended (http://www.openssh.org/faq.html#2.10).
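To make the "just byte strings" point concrete, here is a small
illustration in Python (not something scp does; the filename and the
helper show_as are made up for the example). The same bytes are a
perfectly good name under ISO-8859-1 and an invalid sequence under UTF-8:

```python
# Illustration only: one filename, stored as bytes, viewed from two locales.
# "caf\u00e9.txt" is a hypothetical name containing e-acute.
name_latin1 = "caf\u00e9.txt".encode("iso-8859-1")  # bytes as a Latin-1 machine writes them


def show_as(raw: bytes, encoding: str) -> str:
    """Decode filename bytes the way a program running in that encoding would."""
    # errors="replace" substitutes U+FFFD for bytes that are invalid in `encoding`
    return raw.decode(encoding, errors="replace")


print(repr(name_latin1))                   # the raw bytes stored on disk
print(show_as(name_latin1, "iso-8859-1"))  # readable in a Latin-1 locale
print(show_as(name_latin1, "utf-8"))       # replacement character in a UTF-8 locale
```

The bytes on disk never change; only the interpretation does, which is
why the receiving side has no reliable way to know what was meant.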

It may be possible to extend sftp to do this. If
draft-ietf-secsh-filexfer ever gets out of the Internet-Draft status, it
will provide a mechanism for filename translation (see section 6 of
ftp://ftp.nordu.net/internet-drafts/draft-ietf-secsh-filexfer-13.txt).
However, doing this in advance of the standards process would cause
interoperability problems.

In the meantime, I suggest recoding the filenames after the fact. A
rename(1) script could do the job, such as:

  rename 'use Encode; Encode::from_to($_, "iso-8859-1", "utf-8")'
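If you would rather preview the changes first, the same idea can be
sketched in Python. This is only a rough sketch under a few assumptions:
the function names (recode_name, is_valid, recode_directory) are
invented for the example, names that already decode cleanly in the
target encoding are assumed not to need recoding (a heuristic, not a
guarantee), and it only looks at a single directory:

```python
import os


def recode_name(raw: bytes, src: str = "iso-8859-1", dst: str = "utf-8") -> bytes:
    """Reinterpret filename bytes written in `src` as the same text in `dst`."""
    return raw.decode(src).encode(dst)


def is_valid(raw: bytes, encoding: str) -> bool:
    """Return True if the bytes decode cleanly in `encoding`."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def recode_directory(directory: bytes, src: str = "iso-8859-1",
                     dst: str = "utf-8", dry_run: bool = True):
    """Collect (old, new) name pairs; rename on disk only if dry_run is False."""
    changes = []
    # Passing a bytes path makes os.listdir return raw bytes names.
    for old in os.listdir(directory):
        if is_valid(old, dst):
            continue  # heuristic: already fine in dst (this includes plain ASCII)
        try:
            new = recode_name(old, src, dst)
        except (UnicodeDecodeError, UnicodeEncodeError):
            continue  # not representable; leave the file alone
        if os.path.exists(os.path.join(directory, new)):
            continue  # never overwrite an existing file
        changes.append((old, new))
        if not dry_run:
            os.rename(os.path.join(directory, old), os.path.join(directory, new))
    return changes
```

Calling recode_directory(b".") lists what would be renamed, and
recode_directory(b".", dry_run=False) performs the renames. Working on
bytes paths throughout avoids depending on the locale of the process
doing the renaming.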

Cheers,

-- 
Colin Watson                                       [cjwatson@debian.org]



