[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#798300: linux: fs/isofs/rock.c coarsely truncates file names of 254 or 255 bytes length



Source: linux
Version: 4.1.6
Severity: normal

Dear Maintainer,

I experience coarse name truncation with very long Rock Ridge
filenames in mounted ISO 9660 filesystems. It happens with
names of 255 or 254 characters. 253 is ok.

Now i riddle about the reasoning behind constant 254
in fs/isofs/rock.c function get_rock_ridge_filename() line 270.
Is there a limit in the data structures which would forbid
filenames of 254 or 255 bytes length ?

Affected are at least Debian kernels 4.1.0-2-amd64 and
3.16.0-4-amd64 but probably it reaches far more back.

I seemingly fixed the problem for length 254 and 255 by changing
254 to 256 and building a new kernel. VM makes courageous.

--------------------------------------------------------------

Debian's
  /usr/src/linux-4.1.6/fs/isofs/rock.c
and also
  https://github.com/torvalds/linux/blob/master/fs/isofs/rock.c
function get_rock_ridge_filename(), line 270 say

                        if ((strlen(retname) + rr->len - 5) >= 254) {
                                truncate = 1;
                                break;
                        }

The value 1 in "truncate" causes interpretation of NM
entries to end prematurely. "break" prevents the current
NM name piece from being added to the result.

Different distribution of the name over NM entries causes
different effective truncation lengths.
It depends on ECMA-119 name and the SUSP data in the ECMA-119
directory record how much of the name fits into the first NM,
and how much must go to the second NM in Continuation Area.
This second NM gets dropped.

Further it seems that truncation without any attempt to generate
a unique filename is suboptimal for the further processing of
such names.

In the times of UTF-8, truncation would have to show
some additional intelligence to avoid incomplete multi-byte
characters.

--------------------------------------------------------------

To reproduce the problem, create an ISO with a Rock Ridge
filename of at least 254 bytes length

  $ touch my_dummy_file

  $ genisoimage -R -o test.iso -graft-points /12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234=my_dummy_file


Then mount it and list the file

  # mount -o loop test.iso /mnt/iso

  $ ls /mnt/iso
  12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456

That's 146 characters only.
Filenames of length 245 or 255 with same 146 first characters
all get mapped to the same name. readdir(3) delivers several
dirents with the same .dname.

Different production yields different truncations.

  $ xorriso -as mkisofs ...same.options.as.above...

yields after mount 155 characters.

  $ xorriso -outdev /dvdbuffer/test.iso -blank as_needed \
            -map my_dummy_file /...254.characters...

yields 93 characters.

xorriso as alternative reader shows in all three cases 254
characters.

--------------------------------------------------------------
Investigating the possible impact of a change from 254 to 256
in rock.c
-------------------------------------------------------------
dir.c:

static do_isofs_readdir() calls get_rock_ridge_filename()
and hands over the result to dir_emit(), which i understand from
include/linux/fs.h is a "filldir" function type, by which
the kernel receives dirent information.
Such a receiver and its consumers should be good for 255
bytes plus 0, shouldn't it ?

The name buffer tmpname is a parameter handed over by the only
caller isofs_readdir(). It obtains the buffer by
        tmpname = (char *)__get_free_page(GFP_KERNEL);
"struct page" sends me on a log journey into regions of the
memory management which i do not understand.
Nevertheless a widely used object named "page" that can take
254 bytes should be able to take 256 bytes, shouldn't it ?.

This buffer is not used after the call of do_isofs_readdir()
but rather gets freed immediately.

-------------------------------------------------------------
namei.c:

static isofs_find_entry() uses the result of get_rock_ridge_filename()
as argument of static isofs_cmp(). I believe to trace it through
struct dentry_operations to isofs_dentry_cmp_common() in
fs/isofs/inode.c. Its use of struct qstr *name seems harmless
with a potential name->len == 255.

The name buffer tmpname is a parameter handed over by the only
caller isofs_lookup():
  struct page *page;
  page = alloc_page(GFP_USER);
The parameter tmpname is
  page_address(page)
As above: a "page" that can take 254 bytes should be able
to take 256 bytes.

This buffer is not used after the call of isofs_find_entry()
but rather gets freed immediately.

-------------------------------------------------------------

So s/254/256/ _should_ be safe ... if i am not mistaken.

On my Debian "unstable" VM i worked my way through
  http://kernel-handbook.alioth.debian.org/ch-common-tasks.html
(mainly paragraph 4.5) and did the smallest possible change
from "4" to "6".
The VM is now running with the changed kernel and shows
filenames up to 255 (US-ASCII) characters length.

I gave it a stress test by taring the content of a 3.2 GB
backup ISO with 67000 files, stemming from 20 years of
linuxing. Not really a challenge with exotic characters,
but it has the long names (for a HFS+ mangling test).

genisoimage is obviously willing to write Rock Ridge
names longer than 255 characters.
This gave a good opportunity to verify that the kernel
still prevents file names larger than MAX_SIZE.
(If this is mandatory at all ...)

(xorriso or libisofs silently truncate the filename to 255
 bytes. As their upstream maintainer i will have to investigate
 whether it's UTF-8-safe and why there is no warning.)

-------------------------------------------------------------

Have a nice day :)

Thomas

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.1.6 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)


Reply to: