
Bug#500540: kdebase: automounting vfat (partialy) case sensitive due to utf8 is plain wrong and dangerous



On Fri, Oct 10, 2008 at 11:31:27PM +0300, Teemu Likonen wrote:
> Heinrich Langos <henrik-debian-bug@prak.org> writes:
> 
> > Now let's try again with more sane vfat options:
> >
> > # mount | grep vfat
> > /dev/sda1 on /mnt type vfat (rw,nosuid,nodev,noatime,uhelper=hal,flush,uid=1000,shortname=lower,check=relaxed,codepage=850,iocharset=iso8859-1)
> > "shortname=lower" should be used but don't take my word for it.
> 
> "shortname=mixed" works nicely with "utf8" flag, and command
>     touch test Test teSt
> touches the same file three times.

The main question was whether it preserves the case when you do

touch TeSt test TEST

In my setting ("shortname=lower") the resulting file was "test". 
In your case it should be "TeSt" but opening "test" should also work. 
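
To make that expectation concrete, here is a toy Python model of a
case-insensitive, case-preserving directory. It is only a sketch of the
outcomes described in this thread, not of the kernel's vfat code; the
function name touch_all and the display parameter are made up for
illustration.

```python
def touch_all(names, display=lambda name: name):
    """Toy model: names that differ only in case share one directory entry."""
    entries = {}
    for name in names:
        # The first spelling creates the entry; later spellings that differ
        # only in case just hit the existing entry.
        entries.setdefault(name.lower(), name)
    # 'display' stands in for how the mount option presents the stored name.
    return [display(stored) for stored in entries.values()]


# Case preserved, as expected for "shortname=mixed" in this thread.
preserved = touch_all(["TeSt", "test", "TEST"])

# Name shown in lower case, as observed with "shortname=lower".
lowered = touch_all(["TeSt", "test", "TEST"], display=str.lower)

print(preserved, lowered)
```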

The problem with case sensitive mounting and short names is that the
application doesn't know which "shortname" option was used. So if my
program writes "/media/ipod/foo.mp3" and you used "shortname=win95",
then the name on the filesystem will be FOO.MP3.

When you mount that filesystem with iocharset=utf8, name lookups become
case sensitive and my program WILL break, as it will not be able to open
the file it wrote.
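
A minimal Python sketch of that failure, assuming the name ended up on
disk as FOO.MP3 as described above. The helper lookup is hypothetical and
only imitates the two matching modes; it is not how the kernel or gnupod
does it.

```python
def lookup(entries, name, case_sensitive):
    """Return the stored name matching 'name', or None if nothing matches."""
    if case_sensitive:
        return name if name in entries else None
    wanted = name.lower()
    return next((e for e in entries if e.lower() == wanted), None)


# The program wrote "foo.mp3", but the stored name is upper case.
on_disk = ["FOO.MP3"]

# Case-sensitive matching: the program cannot find its own file.
strict = lookup(on_disk, "foo.mp3", case_sensitive=True)

# Case-insensitive matching: the same request succeeds.
relaxed = lookup(on_disk, "foo.mp3", case_sensitive=False)

print(strict, relaxed)
```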

gnupod, for example, uses short filenames explicitly to avoid other problems
and DOES definitely break in such a situation. I've read of other music
management software that does the same thing.

Long filenames are a different beast altogether...

> > And who came up with the idea to mount vfat with utf8 anyway? It was
> > never designed to take short utf8 names. Those are strictly 8.3 and if
> > you try to stick utf8 characters in there, you get all kinds of length
> > checking problems. Long names on vfat are stored in unicode anyway. So
> > what's the big gain here? For the sake of squeezing utf8 into places
> > where it never was meant to be, we get messed up filesystems?
> 
> I admit that some of the ideas may have come from me. I have described
> one aspect of this issue in the kernel bug #417324:
> 
>     http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=417324
> 
> I'm pretty confused with all these "iocharset", "codepage" and "utf8"
> flags but I'm certain that in Debian Etch (and its default locale
> settings [UTF-8] and kernel settings) filenames get converted totally
> wrong.
> 
> Long filenames in FAT filesystem are in the form we call UTF-16 today.
> In default Etch system FAT's UTF-16 filenames get converted to
> ISO-8859-1 if the filesystem is not mounted with "utf8" flag. The other
> direction is so that Etch's UTF-8 filenames are assumed to be in
> ISO-8859-1 and, since it's a single-byte encoding, every byte (even in a
> UTF-8 multibyte character) gets converted separately to UTF-16. This
> produces complete garbage of course.
>  
> KDE is nice enough to use "utf8" flag but someone reported that Gnome
> does not (or at least did not) mount with this flag. Thus it produces
> filenames which are unreadable in other systems (including MS Windows).
> 
> I guess the change in kernel settings made you see this issue after
> upgrading from Etch to Lenny. The option CONFIG_FAT_DEFAULT_IOCHARSET
> was changed from "iso-8859-1" to "utf8".

Ok, so in order to fix a gnome bug you broke everything else? :-)
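
For the record, the garbling you describe is easy to reproduce outside
the kernel. This Python sketch only imitates the byte-wise conversion
(UTF-8 bytes read as ISO-8859-1, then stored as UTF-16); the sample
filename is made up.

```python
# A filename as a UTF-8 userland would hand it over (U+00E4 is a-umlaut).
name = "\u00e4.txt"
utf8_bytes = name.encode("utf-8")

# Wrong path: every UTF-8 byte is treated as one ISO-8859-1 character,
# then converted separately to UTF-16 for the long filename.
misread = utf8_bytes.decode("iso-8859-1")
stored_wrong = misread.encode("utf-16-le")

# Right path: the bytes are decoded as UTF-8 before conversion.
stored_right = utf8_bytes.decode("utf-8").encode("utf-16-le")

# What another system sees when it reads the long filename back.
seen_wrong = stored_wrong.decode("utf-16-le")
seen_right = stored_right.decode("utf-16-le")

print(repr(seen_wrong), repr(seen_right))
```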

> > As far as I have seen in archives and related bug reports the blame
> > for this problem gets shifted around from KDE to pmount to the kernel
> > itself and all the way back. Everybody happily points fingers at the
> > others.
> 
> This seems to be pretty complicated. We have to make
> 
>     VFAT/UTF-16  <-->  Debian/UTF-8
> 
> conversion work somehow, and in Etch it does not work (except when KDE
> does the mounting). In Lenny the conversion currently works by default
> with just "mount" without any options; this is because of the change in
> the kernel settings.
> 
> But then there's the issue you reported... :-( In my experience
> "shortname=mixed" works nicely without character case problems.


Please read this: http://www.nslu2-linux.org/wiki/HowTo/MountFATFileSystems
if you need a short explanation of the vfat mount options codepage,
iocharset and utf8.

Let me quote:
> When the utf8 flag is specified along with iocharset the iocharset value 
> only controls the character case handling - it has no effect on the encoding 
> of the UNICODE characters as this will always use UTF-8.

So, according to it, the "right thing" would be to select the iocharset
according to your language AND specify the utf8 flag. So in my case:
mount ... iocharset=iso8859-1,utf8

If gnome doesn't do it right, please go ahead and fix gnome but leave the 
kernel default iocharset as it was.
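
If somebody wants to check what a given automounter actually passes, a
small Python sketch like the following could split an option string as
printed by "mount" and report the two settings in question. The helper
parse_mount_options is hypothetical, and the sample string is the one
from my earlier mail.

```python
def parse_mount_options(option_string):
    """Split 'a,b=c' style mount options into a dict; bare flags map to True."""
    options = {}
    for item in option_string.split(","):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


sample = ("rw,nosuid,nodev,noatime,uhelper=hal,flush,uid=1000,"
          "shortname=lower,check=relaxed,codepage=850,iocharset=iso8859-1")
opts = parse_mount_options(sample)

# The two settings discussed above: the iocharset value and the utf8 flag.
iocharset = opts.get("iocharset")
has_utf8 = opts.get("utf8", False)

print(iocharset, has_utf8)
```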


Unfortunately the "right thing" can't be done by default:
Let me quote once more:
> The utf8 flag cannot be specified by default - it must be given as an 
> explicit argument to mount

I am not sure if that is still the case with the current kernel. Maybe 
somebody grew tired of this limitation and fixed it. Could you check that?
I'm drowning in work... could you also add a bts reference to 417324 to 
show the dependency and reopen that bug?


 
> > -henrik
> > (Using Debian since buzz.)
> 
> Wow, I'm from the Woody/Sarge generation. :-)

Yeah, I should get a medal, shouldn't I? :-)



