On Fri, 2006-03-03 18:54:30 +0100, Klaus Ade Johnstad <klaus@skolelinux.no> wrote:
> Friday 3 March 2006, 07:48, Klaus Ade Johnstad wrote:
> I found out that if I use in smb.conf
> unix charset = cp850
> display charset = cp850
>
> Then all the German and Norwegian special characters looks "fine" again

That basically means that your Linux box, too, uses cp850 (or something
similar) as its local umlaut representation, and that you've loaded a
working font and mapping for it.

> I'm not sure if this is "a smart thing", but I've not been able to get

It is enough if it solves your problem, but it's not a general solution,
e.g. it won't work if you ever need to support fancier characters that
cp850 cannot represent.

> this result using the different methods with "iconv -f cp850 -f utf-8"
> or "convmv -f cp850 -t utf8".

First, understand the stack through which filenames are saved and seen.

Let's use an example, the German sharp s, "ß". At the console, typing a
"ß" produces 0xdf (in iso-8859-1) or 0xc3 0x9f (in UTF-8). Notice that in
UTF-8 this is two bytes, which the console driver displays as _one_ glyph
on your monitor.

If this is given as a filename to the VFS API, the VFS will usually save
it as-is. (There are rare cases where the FS driver _forces_ a specific
internal representation and may therefore convert the filename on its
own, like NTFS, which generally uses a two-byte representation.)

So now we've got a filename with 0xc3 0x9f in it. If the console is set
up to use Unicode/UTF-8, that will display fine. If it is configured to
use e.g. iso-8859-1, you'll see two wrong characters, because such
encodings are purely one-byte encodings and treat each byte as a
character of its own.

Now Samba steps in. Since SMB (in all newer protocol variants, ignoring
traditional Lanman here) uses the same always-two-bytes representation
as NTFS does (surprise, surprise), Samba needs to convert from whatever
the filename physically contains to this two-byte encoding. This is why
Samba needs to be told which charset is actually in use.
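
To make the byte-level picture above concrete, here is a small Python
sketch (not part of the original exchange). It prints how "ß" is stored in
each charset mentioned in this mail and shows what a rename from cp850 to
UTF-8 amounts to at the byte level. Two assumptions are mine: "utf-16-le"
stands in for what I called the two-byte representation, and
reencode_name() is a hypothetical helper for illustration, not how convmv
or Samba are actually implemented.

```python
SHARP_S = "\u00df"  # the character "ß"


def encodings_of(ch):
    """Return the bytes for ch in the charsets discussed in this mail."""
    return {
        "iso-8859-1": ch.encode("iso-8859-1"),
        "utf-8": ch.encode("utf-8"),
        "cp850": ch.encode("cp850"),
        # Assumption: utf-16-le as the "always two bytes" form.
        "utf-16-le": ch.encode("utf-16-le"),
    }


def reencode_name(raw, src, dst):
    """Hypothetical helper: reinterpret filename bytes from src as dst."""
    return raw.decode(src).encode(dst)


if __name__ == "__main__":
    for name, raw in encodings_of(SHARP_S).items():
        print(f"{name:10} {raw.hex()}")

    # Reading UTF-8 bytes with a one-byte charset yields two characters
    # instead of one, which is the "two wrong chars" effect.
    utf8_bytes = SHARP_S.encode("utf-8")
    print(len(utf8_bytes.decode("iso-8859-1")))

    # A cp850 filename re-encoded to UTF-8 (only the bytes change).
    print(reencode_name(b"stra\xe1e", "cp850", "utf-8"))
```

The point of the sketch is that the same character has a different byte
sequence in every charset, so any layer that guesses the wrong source
charset will either show garbage or convert garbage.
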
(And for Lanman clients, Samba will also try to convert the NTFS-like
two-byte encoding into a single-byte charset encoding.)

> I suspect that the "correct" way of dealing with this is by using
> convvm/iconv, but I haven't managed that yet.

You're now basically back to a very simple DOS-like approach. That may be
enough for your tasks.

Regards, JBG

-- 
Jan-Benedict Glaw    jbglaw@lug-owl.de    . +49-172-7608481
"A free opinion in a free head for a free state full of free citizens"
Against censorship on the Internet! | Against war in Iraq!
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));