Bug#535702: Faulty encoding/decoding of filenames
Package: dolphin
Version: 4:4.2.2-1
I have a disk mounted that uses a few ISO8859-1-encoded filenames while the
rest of my system is using UTF-8. Dolphin fails to handle files whose names
contain anything beyond ASCII (the common subset) in such a setup.
Two cases happened to me:
1. I can't browse into a directory with umlauts. Dolphin displays the dir with
a question mark in place of the umlaut but doesn't allow you to click on it in
order to browse into it.
2. I can't even rename the directory. This is probably just a different aspect
of the same problem: Dolphin complains that the file can't be found.
The weird part is that the file it claims it can't find has very little in
common with the one on the disk.
Example: Mission_erfüllt.ogg is the file on disk. Encoded with ISO8859-1, its
name is "Mission_erf\xfcllt.ogg". Now, when I try to rename the file, Dolphin
claims it can't find "Mission_erf�llt.ogg", which would be
"Mission_erf\xef\xbf\xbdllt.ogg". If I decode these three bytes according to
UTF-8, they form the code point U+FFFD, which is a "replacement character"[1],
probably inserted because the filename couldn't be decoded according to the
current locale. What must be done is to preserve the bytewise representation
of the filename. In order to display it, Dolphin can try to transcode it and do
replacements there, but for accessing the name, e.g. for renaming, it must not
use a filename resulting from this lossy encoding roundtrip.
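To illustrate the roundtrip described above, here is a minimal Python sketch.
It is not Dolphin's code (Dolphin is written in C++ with Qt); it only reproduces
the byte sequences from the example. The variable names are made up for this
illustration, and the "surrogateescape" error handler is Python's own way of
keeping undecodable bytes recoverable, shown here as just one possible
approach.

```python
# Raw filename bytes as stored on disk (ISO 8859-1: 0xFC is "ü").
raw_name = b"Mission_erf\xfcllt.ogg"

# Lossy decode: 0xFC is not a valid UTF-8 sequence, so the decoder
# substitutes U+FFFD (the replacement character).
display_name = raw_name.decode("utf-8", errors="replace")

# Re-encoding the display string gives bytes that differ from the ones
# on disk, so a lookup with this name cannot find the file.
roundtrip = display_name.encode("utf-8")
print(roundtrip)              # contains \xef\xbf\xbd instead of \xfc
print(roundtrip == raw_name)  # False

# One lossless alternative (an assumption, not what Dolphin does):
# map each undecodable byte to a lone surrogate and map it back on
# encoding, so the original bytes survive the roundtrip.
internal = raw_name.decode("utf-8", errors="surrogateescape")
restored = internal.encode("utf-8", errors="surrogateescape")
print(restored == raw_name)   # True
```

The point is that the lossy string is fine for display only; any file
operation needs either the original bytes or a representation that maps back
to them exactly.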
Please don't suggest that I should fix my locale, mount the disk with a
different encoding or similar things. Those are good ideas (if it weren't
for pluggable media) but no excuse for Dolphin performing lossy roundtrip
conversions on data it doesn't understand.
Uli
[1] Quoting from http://www.unicode.org/charts/PDF/UFFF0.pdf:
FFFD REPLACEMENT CHARACTER
* used to replace an incoming character whose value
is unknown or unrepresentable in Unicode