
Bug#612675: libkio5: KTar class have broken UTF-8 support (longlink)



	Hi!

 First of all, sorry for the late response. The bug got overlooked in
the aftermath of the squeeze release; people were quite busy here.

* Rinat <ibragimovrinat@mail.ru> [2011-02-09 23:16:22 CET]:
> First, tar archives have to use the "longlink trick" to
> store names longer than 100 bytes. The KTar class has
> functions implementing longlink, but they check the name
> length in _characters_, not in bytes. For non-ASCII
> characters in UTF-8, the length of a string in bytes and
> its length in characters do not match. In my case the file
> name was shorter than 100 characters but longer than 100
> bytes, so the name was simply truncated. This behavior
> can be observed with non-ASCII UTF-8 or any other
> multibyte encoding. If the file name is very long, the
> resulting .tar may become unreadable.

 Thanks for digging that up. From reading the diff it's clear that this
is a mistake and should be addressed.
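
 To illustrate the length mismatch, here is a minimal sketch in plain
C++. It uses std::string rather than the Qt types KTar actually works
with, and the names utf8CodePoints, needsLongLink and kTarNameFieldSize
are made up for this example; they are not taken from the KTar sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Size of the name field in a tar header, in bytes.
constexpr std::size_t kTarNameFieldSize = 100;

// Count Unicode code points in a UTF-8 encoded string by counting
// every byte that is not a continuation byte (bit pattern 10xxxxxx).
std::size_t utf8CodePoints(const std::string &s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Hypothetical check: decide on the *encoded byte length*, because
// that is what has to fit into the header field.
bool needsLongLink(const std::string &encodedName) {
    return encodedName.size() > kTarNameFieldSize;
}

// Small self-check: a name made of 60 two-byte Cyrillic letters.
void demoLongName() {
    std::string name;
    for (int i = 0; i < 60; ++i) {
        name += "\xD1\x8F";  // U+044F encoded as UTF-8
    }
    assert(utf8CodePoints(name) == 60);  // fewer than 100 characters
    assert(name.size() == 120);          // more than 100 bytes
    assert(needsLongLink(name));         // byte-based check catches it
}
```

 A check based on the character count (60 here) would skip the longlink
entry and truncate the 120-byte name, which matches the behavior
described in the report.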

> Second, the calculation of the 'chksum' field of the tar header is
> also broken: the 'buffer' array is defined as char, a signed type,
> while in the tar sources chksum is obtained as a sum of unsigned
> values (actually there is a trick to emulate (unsigned char):
> converting to integer and then doing a bitwise AND with
> 0xFF). Maybe a bad checksum was the reason for the unreadable .tar.

 This part, though, is not entirely clear to me. On the major
architectures char is signed, so I would assume that a chksum error in
this area should have hit a lot of people already? Given that int is
signed by default, I wonder whether this is the proper approach, or
whether the value should rather be cast to signed char (the signedness
of plain char varies across architectures).
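
 To make the difference concrete, here is a small sketch in plain C++
that sums a 512-byte header both ways. The function names are
hypothetical and this is not the KTar code; the only things taken from
the tar header layout are the 512-byte header size and the 8-byte
chksum field at offset 148, which is counted as spaces while summing.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kHeaderSize = 512;
constexpr std::size_t kChksumOffset = 148;
constexpr std::size_t kChksumSize = 8;

// True if byte i lies inside the chksum field itself.
inline bool inChksumField(std::size_t i) {
    return i >= kChksumOffset && i < kChksumOffset + kChksumSize;
}

// Sum every header byte as an unsigned value (0..255).
long checksumUnsigned(const char *header) {
    long sum = 0;
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
        sum += inChksumField(i) ? ' ' : static_cast<unsigned char>(header[i]);
    }
    return sum;
}

// Sum every header byte as a signed value (-128..127), which is what
// plain char arithmetic gives on architectures where char is signed.
long checksumSigned(const char *header) {
    long sum = 0;
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
        sum += inChksumField(i) ? ' ' : static_cast<signed char>(header[i]);
    }
    return sum;
}

// Small self-check with an otherwise empty header.
void demoChecksum() {
    char header[kHeaderSize] = {0};
    // Pure ASCII (here: all zero bytes): both sums agree, 8 * ' ' = 256.
    assert(checksumUnsigned(header) == 256);
    assert(checksumSigned(header) == 256);

    // One byte with the high bit set (0xD1 = 209) makes them diverge.
    header[0] = '\xD1';
    assert(checksumUnsigned(header) == 256 + 209);
    assert(checksumSigned(header) == 256 - 47);
}
```

 The two sums only differ when the header contains bytes above 0x7F,
for example in a non-ASCII file name. That would be consistent with the
problem going unnoticed for archives with pure ASCII names.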

 Out of curiosity: you filed this from an i386 system. Did you maybe
copy the backup from or to another architecture such as arm, armel,
powerpc or s390 (where char is unsigned by default)? Were they somehow
involved in your presumed checksum error? The reason behind the
question is: if we "fix" the calculation in the direction that you
propose, this would break backups made now on the architectures that
have char signed by default, because it would result in a different
checksum.

 Or is there a mistake in my thinking here?

 Thanks,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |
