
Re: How to handle whitespace in filenames ???



Michael D. Schleif wrote:

> Craig Dickson wrote:
>
> > Michael D. Schleif wrote:
> > 
> > > How would you like to handle 0x08, 0x0a or 0x0d ???  Remember, we are
> > > talking about text handling here, not binaries . . .
> > 
> > We can sensibly limit ourselves to printable characters for filenames;
> > it's silly to suggest that if you let people use spaces, next they'll
> > want control characters.
> 
> How so?  My reply is in response to this: ``well, it's a valid
> character, why shouldn't it be there?''

Because you're not thinking in terms of what users realistically will
want to do. Why would anyone WANT to put a backspace into a filename? A
space is a printable character that serves a purpose in written text, to
separate words from one another, so a filename like "Letter to Joe.doc"
is meaningful, and is easier to read than LetterToJoe.doc. A backspace,
or other control character, makes no sense, and it wouldn't bother me in
the least if the filesystem simply didn't accept them. But spaces are
meaningful to people, and should be allowed and properly supported by
the shell and other standard tools.

Of course, if you want to admit that MacOS and Win32 can do something
better than Unix can -- which is the obvious implication of a lot of
what you've said on this subject -- be my guest.

> Simply because something can be done does not warrant doing it . . .

True, but in this case there's no sensible reason not to other than
backwards-compatibility. I'm not really arguing in favor of changing all
the thousands of Unix tools in the world to handle filenames with
spaces. For that matter, most of them handle such filenames fine under
limited circumstances, and it isn't that hard to write a shell script
that will do so. My specific point was that the standard Unix shells, by
default, expand variables in a way that causes a number of problems,
filenames with spaces being only one example.

> > There is a good reason to support spaces if you want your OS to appeal
> > to ex-Windows or ex-Mac users, who are used to creating filenames like
> > "Letter to Joe.doc" or "Smith Family Budget.xls".
> 
> I'll leave that debate for others -- nevertheless, this is one
> remarkable reason that windoze file handling is so weak ;>

Windows is defective in many ways, but the use of spaces in filenames
has nothing to do with any of them. That has to be the lamest straw man
argument I've ever come across; it amounts to nothing more than poking
fun at something that's unpopular with the crowd you're playing to,
without regard for truth or fairness.

> Besides, my point, as stated previously, is this, "Perhaps, you ought to
> ``correct'' the tools, then impose arbitrary complexity ???"
> 
> Please, do not put the cart before the horse . . .

You either haven't been reading very carefully, or aren't sufficiently
informed to have a sensible opinion, to judge by what you've said so
far. This "arbitrary complexity" thing is total nonsense.

To be specific, I've suggested that things would be better today if one
aspect of the Bourne shell had been designed better, namely how it
treats the expansion of variables whose values contain embedded spaces.
Right now, after the assignment A="foo bar", an unquoted expansion of $A
is split by the shell into two separate words, foo and bar, so any
command it is passed to receives two arguments rather than the single
string "foo bar", as it was originally given (with quotes). This default
behavior breaks filenames with spaces (you have to remember to specify
"$A", in quotes, to preserve the string properly), and allows for a
number of other problems as well, such as the well-known security
problems that can arise when setuid or CGI scripts expand variables
whose values were originally read from user input.
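
To make the splitting concrete, here is a minimal sketch in POSIX sh. The
helper name count_args is purely illustrative (it is not a standard
tool), and the sketch assumes IFS still has its default value of space,
tab and newline.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() {
    echo "$#"
}

A="foo bar"

# Unquoted: the shell splits the expanded value on the characters in $IFS.
unquoted=$(count_args $A)

# Quoted: the value reaches the function as one argument.
quoted=$(count_args "$A")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

With the default IFS, the unquoted call should see two arguments and the
quoted call one.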

It would have been better for the default expansion of a quoted string
to include the quotes, precisely to preserve strings with embedded
spaces. There could have been some trivial syntax to cause such strings
to be broken up, such as when you use a variable to store a list of
space-delimited strings. It wouldn't have been a major change, and it
would have been better.

There are also minor implications for other tools here; for example,
'ls' and 'find' should be able to quote filenames with embedded spaces
in their output, to allow them to be parsed correctly. But this is still
a minor shift, and had it been made early enough in Unix's history, it
would have been a small thing, and overall an improvement. I don't think
it can happen now, at least not in the context of the sh-derived shells
and the traditional Unix tools; you'd break too many things. If someone
wants to design a new shell from the ground up, and get this right, it
would be nice.
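
As a rough illustration of the parsing problem with today's tools, the
sketch below compares word-splitting find's output against letting find
pass each name along itself. The scratch directory and the file name are
made up for the example, and it assumes mktemp is available (it is
common, but not guaranteed everywhere).

```shell
#!/bin/sh
# Scratch directory holding one file whose name contains spaces.
dir=$(mktemp -d)
touch "$dir/Letter to Joe.doc"

# Naive: word-split find's output; the single name falls apart into pieces.
naive=0
for f in $(find "$dir" -type f); do
    naive=$((naive + 1))
done

# Safer: find hands each name to the command as one intact argument,
# and the inner shell just reports how many arguments it received.
safe=$(find "$dir" -type f -exec sh -c 'echo "$#"' sh {} +)

echo "naive loop saw $naive item(s); -exec saw $safe file(s)"
rm -rf "$dir"
```

The naive loop counts several items for what is really one file, which
is the kind of thing quoted output from 'find' would have avoided.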

> > Unfortunately, since spaces in filenames have never been a priority for
> > Unix users, most Unix tools behave counter-intuitively (from the
> > perspective of someone new to the system) when confronted with such
> > things.
> 
> In general, your examples are very weak.  Are you familiar with $IFS and
> its ilk?

Sure. But there's absolutely no need to alter the basic token-parsing in
the shell; it already understands strings with spaces as long as they're
quoted. All that's necessary is that the expansion of variables
containing such strings retain the quotes. It shouldn't be necessary to
alter IFS just to get reasonable behavior out of the shell.
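
For comparison, here is a small sketch of the IFS workaround next to
plain quoting. Again, count_args is just an illustrative helper name,
and the sketch assumes IFS starts out set to its default value.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() {
    echo "$#"
}

A="foo bar"

# Workaround: with IFS empty, unquoted expansions are not field-split.
# This affects every expansion until IFS is restored.
old_ifs=$IFS
IFS=
with_ifs=$(count_args $A)
IFS=$old_ifs

# Quoting gets the same result without touching any global state.
with_quotes=$(count_args "$A")

echo "IFS emptied: $with_ifs argument(s); quoted: $with_quotes argument(s)"
```

Both calls should see a single argument, but the IFS version has to save
and restore shell-wide state to get there.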

Craig


