Bug#798300: marked as done (linux: fs/isofs/rock.c coarsely truncates file names of 254 or 255 bytes length)
Your message dated Sun, 9 May 2021 14:45:45 +0200
with message-id <YJfZeUZPUtHs/LNQ@eldamar.lan>
and subject line Re: Bug#798300: linux: fs/isofs/rock.c coarsely truncates file names of 254 or 255 bytes length
has caused the Debian Bug report #798300,
regarding linux: fs/isofs/rock.c coarsely truncates file names of 254 or 255 bytes length
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)
--
798300: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=798300
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: submit@bugs.debian.org
- Subject: linux: fs/isofs/rock.c coarsely truncates file names of 254 or 255 bytes length
- From: "Thomas Schmitt" <scdbackup@gmx.net>
- Date: Mon, 07 Sep 2015 22:18:34 +0200
- Message-id: <738555842256684705@scdbackup.webframe.org>
Source: linux
Version: 4.1.6
Severity: normal
Dear Maintainer,
I experience coarse name truncation with very long Rock Ridge
filenames in mounted ISO 9660 filesystems. It happens with
names of 255 or 254 characters. 253 is ok.
Now i riddle about the reasoning behind constant 254
in fs/isofs/rock.c function get_rock_ridge_filename() line 270.
Is there a limit in the data structures which would forbid
filenames of 254 or 255 bytes length ?
Affected are at least Debian kernels 4.1.0-2-amd64 and
3.16.0-4-amd64 but probably it reaches far more back.
I seemingly fixed the problem for length 254 and 255 by changing
254 to 256 and building a new kernel. VM makes courageous.
--------------------------------------------------------------
Debian's
/usr/src/linux-4.1.6/fs/isofs/rock.c
and also
https://github.com/torvalds/linux/blob/master/fs/isofs/rock.c
function get_rock_ridge_filename(), line 270 say
if ((strlen(retname) + rr->len - 5) >= 254) {
truncate = 1;
break;
}
The value 1 in "truncate" causes interpretation of NM
entries to end prematurely. "break" prevents the current
NM name piece from being added to the result.
Different distribution of the name over NM entries causes
different effective truncation lengths.
It depends on ECMA-119 name and the SUSP data in the ECMA-119
directory record how much of the name fits into the first NM,
and how much must go to the second NM in Continuation Area.
This second NM gets dropped.
Further it seems that truncation without any attempt to generate
a unique filename is suboptimal for the further processing of
such names.
In the times of UTF-8, truncation would have to show
some additional intelligence to avoid incomplete multi-byte
characters.
--------------------------------------------------------------
To reproduce the problem, create an ISO with a Rock Ridge
filename of at least 254 bytes length
$ touch my_dummy_file
$ genisoimage -R -o test.iso -graft-points /12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234=my_dummy_file
Then mount it and list the file
# mount -o loop test.iso /mnt/iso
$ ls /mnt/iso
12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
That's 146 characters only.
Filenames of length 245 or 255 with same 146 first characters
all get mapped to the same name. readdir(3) delivers several
dirents with the same .dname.
Different production yields different truncations.
$ xorriso -as mkisofs ...same.options.as.above...
yields after mount 155 characters.
$ xorriso -outdev /dvdbuffer/test.iso -blank as_needed \
-map my_dummy_file /...254.characters...
yields 93 characters.
xorriso as alternative reader shows in all three cases 254
characters.
--------------------------------------------------------------
Investigating the possible impact of a change from 254 to 256
in rock.c
-------------------------------------------------------------
dir.c:
static do_isofs_readdir() calls get_rock_ridge_filename()
and hands over the result to dir_emit(), which i understand from
include/linux/fs.h is a "filldir" function type, by which
the kernel receives dirent information.
Such a receiver and its consumers should be good for 255
bytes plus 0, shouldn't it ?
The name buffer tmpname is a parameter handed over by the only
caller isofs_readdir(). It obtains the buffer by
tmpname = (char *)__get_free_page(GFP_KERNEL);
"struct page" sends me on a log journey into regions of the
memory management which i do not understand.
Nevertheless a widely used object named "page" that can take
254 bytes should be able to take 256 bytes, shouldn't it ?.
This buffer is not used after the call of do_isofs_readdir()
but rather gets freed immediately.
-------------------------------------------------------------
namei.c:
static isofs_find_entry() uses the result of get_rock_ridge_filename()
as argument of static isofs_cmp(). I believe to trace it through
struct dentry_operations to isofs_dentry_cmp_common() in
fs/isofs/inode.c. Its use of struct qstr *name seems harmless
with a potential name->len == 255.
The name buffer tmpname is a parameter handed over by the only
caller isofs_lookup():
struct page *page;
page = alloc_page(GFP_USER);
The parameter tmpname is
page_address(page)
As above: a "page" that can take 254 bytes should be able
to take 256 bytes.
This buffer is not used after the call of isofs_find_entry()
but rather gets freed immediately.
-------------------------------------------------------------
So s/254/256/ _should_ be safe ... if i am not mistaken.
On my Debian "unstable" VM i worked my way through
http://kernel-handbook.alioth.debian.org/ch-common-tasks.html
(mainly paragraph 4.5) and did the smallest possible change
from "4" to "6".
The VM is now running with the changed kernel and shows
filenames up to 255 (US-ASCII) characters length.
I gave it a stress test by taring the content of a 3.2 GB
backup ISO with 67000 files, stemming from 20 years of
linuxing. Not really a challenge with exotic characters,
but it has the long names (for a HFS+ mangling test).
genisoimage is obviously willing to write Rock Ridge
names longer than 255 characters.
This gave a good opportunity to verify that the kernel
still prevents file names larger than MAX_SIZE.
(If this is mandatory at all ...)
(xorriso or libisofs silently truncate the filename to 255
bytes. As their upstream maintainer i will have to investigate
whether it's UTF-8-safe and why there is no warning.)
-------------------------------------------------------------
Have a nice day :)
Thomas
-- System Information:
Debian Release: stretch/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 4.1.6 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
--- End Message ---
--- Begin Message ---
- To: Thomas Schmitt <scdbackup@gmx.net>, 798300-done@bugs.debian.org
- Subject: Re: Bug#798300: linux: fs/isofs/rock.c coarsely truncates file names of 254 or 255 bytes length
- From: Salvatore Bonaccorso <carnil@debian.org>
- Date: Sun, 9 May 2021 14:45:45 +0200
- Message-id: <YJfZeUZPUtHs/LNQ@eldamar.lan>
- In-reply-to: <738555842256684705@scdbackup.webframe.org>
- References: <738555842256684705@scdbackup.webframe.org>
hi,
On Mon, Sep 07, 2015 at 10:18:34PM +0200, Thomas Schmitt wrote:
>
> Source: linux
> Version: 4.1.6
> Severity: normal
>
> Dear Maintainer,
>
> I experience coarse name truncation with very long Rock Ridge
> filenames in mounted ISO 9660 filesystems. It happens with
> names of 255 or 254 characters. 253 is ok.
>
> Now i riddle about the reasoning behind constant 254
> in fs/isofs/rock.c function get_rock_ridge_filename() line 270.
> Is there a limit in the data structures which would forbid
> filenames of 254 or 255 bytes length ?
>
> Affected are at least Debian kernels 4.1.0-2-amd64 and
> 3.16.0-4-amd64 but probably it reaches far more back.
>
> I seemingly fixed the problem for length 254 and 255 by changing
> 254 to 256 and building a new kernel. VM makes courageous.
>
> --------------------------------------------------------------
>
> Debian's
> /usr/src/linux-4.1.6/fs/isofs/rock.c
> and also
> https://github.com/torvalds/linux/blob/master/fs/isofs/rock.c
> function get_rock_ridge_filename(), line 270 say
>
> if ((strlen(retname) + rr->len - 5) >= 254) {
> truncate = 1;
> break;
> }
>
> The value 1 in "truncate" causes interpretation of NM
> entries to end prematurely. "break" prevents the current
> NM name piece from being added to the result.
>
> Different distribution of the name over NM entries causes
> different effective truncation lengths.
> It depends on ECMA-119 name and the SUSP data in the ECMA-119
> directory record how much of the name fits into the first NM,
> and how much must go to the second NM in Continuation Area.
> This second NM gets dropped.
>
> Further it seems that truncation without any attempt to generate
> a unique filename is suboptimal for the further processing of
> such names.
>
> In the times of UTF-8, truncation would have to show
> some additional intelligence to avoid incomplete multi-byte
> characters.
>
> --------------------------------------------------------------
>
> To reproduce the problem, create an ISO with a Rock Ridge
> filename of at least 254 bytes length
>
> $ touch my_dummy_file
>
> $ genisoimage -R -o test.iso -graft-points /12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234=my_dummy_file
>
>
> Then mount it and list the file
>
> # mount -o loop test.iso /mnt/iso
>
> $ ls /mnt/iso
> 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
>
> That's 146 characters only.
> Filenames of length 245 or 255 with same 146 first characters
> all get mapped to the same name. readdir(3) delivers several
> dirents with the same .dname.
>
> Different production yields different truncations.
>
> $ xorriso -as mkisofs ...same.options.as.above...
>
> yields after mount 155 characters.
>
> $ xorriso -outdev /dvdbuffer/test.iso -blank as_needed \
> -map my_dummy_file /...254.characters...
>
> yields 93 characters.
>
> xorriso as alternative reader shows in all three cases 254
> characters.
>
> --------------------------------------------------------------
> Investigating the possible impact of a change from 254 to 256
> in rock.c
> -------------------------------------------------------------
> dir.c:
>
> static do_isofs_readdir() calls get_rock_ridge_filename()
> and hands over the result to dir_emit(), which i understand from
> include/linux/fs.h is a "filldir" function type, by which
> the kernel receives dirent information.
> Such a receiver and its consumers should be good for 255
> bytes plus 0, shouldn't it ?
>
> The name buffer tmpname is a parameter handed over by the only
> caller isofs_readdir(). It obtains the buffer by
> tmpname = (char *)__get_free_page(GFP_KERNEL);
> "struct page" sends me on a log journey into regions of the
> memory management which i do not understand.
> Nevertheless a widely used object named "page" that can take
> 254 bytes should be able to take 256 bytes, shouldn't it ?.
>
> This buffer is not used after the call of do_isofs_readdir()
> but rather gets freed immediately.
>
> -------------------------------------------------------------
> namei.c:
>
> static isofs_find_entry() uses the result of get_rock_ridge_filename()
> as argument of static isofs_cmp(). I believe to trace it through
> struct dentry_operations to isofs_dentry_cmp_common() in
> fs/isofs/inode.c. Its use of struct qstr *name seems harmless
> with a potential name->len == 255.
>
> The name buffer tmpname is a parameter handed over by the only
> caller isofs_lookup():
> struct page *page;
> page = alloc_page(GFP_USER);
> The parameter tmpname is
> page_address(page)
> As above: a "page" that can take 254 bytes should be able
> to take 256 bytes.
>
> This buffer is not used after the call of isofs_find_entry()
> but rather gets freed immediately.
>
> -------------------------------------------------------------
>
> So s/254/256/ _should_ be safe ... if i am not mistaken.
>
> On my Debian "unstable" VM i worked my way through
> http://kernel-handbook.alioth.debian.org/ch-common-tasks.html
> (mainly paragraph 4.5) and did the smallest possible change
> from "4" to "6".
> The VM is now running with the changed kernel and shows
> filenames up to 255 (US-ASCII) characters length.
>
> I gave it a stress test by taring the content of a 3.2 GB
> backup ISO with 67000 files, stemming from 20 years of
> linuxing. Not really a challenge with exotic characters,
> but it has the long names (for a HFS+ mangling test).
>
> genisoimage is obviously willing to write Rock Ridge
> names longer than 255 characters.
> This gave a good opportunity to verify that the kernel
> still prevents file names larger than MAX_SIZE.
> (If this is mandatory at all ...)
>
> (xorriso or libisofs silently truncate the filename to 255
> bytes. As their upstream maintainer i will have to investigate
> whether it's UTF-8-safe and why there is no warning.)
>
> -------------------------------------------------------------
>
> Have a nice day :)
Thanks for the report, and sorry we never came back to you. I do not
think we will ever patch/change this in Debian, so if this still
should be trackled, please report it upstream (in such case feel free
to as well reopen the bug, and keep us in the loop accordingly).
Regards,
Salvatore
--- End Message ---
Reply to: