
Re: How to handle whitespace in filenames ???



Craig Dickson wrote:
> 
> Erik Steffl wrote:
> 
> > > Of course, if you want to admit that MacOS and Win32 can do something
> > > better than Unix can -- which is the obvious implication of a lot of
> > > what you've said on this subject --, be my guest.
> >
> >   ? how could that be? what's the difference (ms<->mac<->unix)? the only
> > difference is that in unix you are more likely to use command line and
> > so have to be aware of 'funny' characters that might be interpreted by
> > shell.
> 
> Oh, there isn't much difference at all. The point is that Windows and
> Mac users put spaces in filenames all the time and it's no problem, even

  cd c:\Program Files fails on win 98, works on win nt 4.x, not sure
about mac.

  however (win nt 4.x, some file.txt exists, notepad some file.txt
works):

C:\Program Files>copy some file.txt some other file.txt
The system cannot find the file specified.

  so I wouldn't say that you can type in filenames with spaces in
windows without any quotes and it would work. it works in some versions
in some situations, which IMO is just confusing (and, of course, it
cannot work in general because there is no way to tell whether
mkdir rrr rrr2 means one directory with a space in its name or two
directories). I would say that these differences (between win and unix)
are no more substantial than differences among unix shells (note the
example grep rrr "`find . -name somename`" that works in tcsh but not
in bash)
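
  to make the mkdir ambiguity concrete, here is a small sketch for a
Bourne-style shell. count_args is a made-up helper (not a standard
command) that just reports how many arguments it received. mkdir is in
the same position: it only sees the arguments, never the line you typed.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got
count_args() {
    echo "$#"
}

count_args rrr rrr2      # two unquoted words: two arguments
count_args "rrr rrr2"    # quoted: one argument that contains a space
count_args rrr\ rrr2     # backslash-escaped space: also one argument
```

  so the quotes (or the backslash) are the only way to say which of the
two meanings you want.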

> in shell scripts. So if Michael is right that spaces in filenames are
> "bad" for Unix, then he's essentially admitting that Unix can't do
> something that Windows and Mac already do. Which is patently inane.

  again, while not exactly the same it's basically the same: you still
have to be careful about spaces and other special characters, both in
windows and unix (and I am pretty sure in mac too). that's for the
command line. gui tools (as long as there is one field per filename, so
it does not have to be parsed) work with any characters in filenames as
long as these characters are allowed in filenames (but that's the same
for unix/win/mac)

> >   (almost) all the tools do handle filenames with spaces. the only
> > problem is the shell, and in most cases it's solvable by being aware of
> > what's actually going on. so there's no need to change thousands of unix
> > tools.
> 
> Not just the shell. 'find' and 'ls' can't be told to put quotes around
> filenames with spaces. But you're right that the shell's treatment of

  that depends, when you are using find ... -exec you can (notice that
'name list' was passed as one argument):

panther:~...pokusy/space>find . -name name\* -exec ./listArgs "{}" \;
listArgs called with following arguments:
        0: <./listArgs>
        1: <./name list>
listArgs called with following arguments:
        0: <./listArgs>
        1: <./namelist>

  when you use gnu find you can use -print0 (so that the filenames are
delimited by '\0' and then NO special characters cause any problems),
useful basically only with gnu xargs and its -0 option (AFAIK)
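
  a minimal sketch of the -print0 / -0 combination, assuming a find and
an xargs that support those options (the gnu ones do). list_names is a
made-up helper name, and the scratch directory exists only for the
demonstration:

```shell
# list_names is a hypothetical helper: prints every name* entry under the
# directory $1, one per line, wrapped in <...> so argument boundaries are visible
list_names() {
    ( cd "$1" && find . -name 'name*' -print0 | xargs -0 printf '<%s>\n' )
}

dir=$(mktemp -d)                         # scratch directory for the demo
touch "$dir/name list" "$dir/namelist"   # one name with a space, one without
list_names "$dir"                        # each file arrives as a single argument
rm -r "$dir"
```

  with plain find | xargs (no -print0, no -0) the name with the space
would be split into two arguments.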

  when you use gnu find you can also use -printf option to print it any
way you want
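
  for example, -printf can put the quotes around each name itself. this
is only a sketch and assumes gnu find (-printf is a gnu extension);
quoted_names is a made-up helper name:

```shell
# quoted_names is a hypothetical helper: prints each name* entry under the
# directory $1 enclosed in double quotes (%p is the name as find found it)
quoted_names() {
    ( cd "$1" && find . -name 'name*' -printf '"%p"\n' )
}

dir=$(mktemp -d)                         # scratch directory for the demo
touch "$dir/name list" "$dir/namelist"
quoted_names "$dir"
rm -r "$dir"
```

  note that this does nothing special for a filename that itself
contains a double quote or a newline, so it is for reading, not for
feeding to another program.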

  ls won't quote the filenames (AFAIK) but you can still use find
instead of ls:-) with ls you can use -b to print non-printable
characters in \ddd form (it still prints spaces as spaces), and you can
also use -m to separate entries by commas

  and if you're really desperate, use sed!

panther:~...pokusy/space>ls | sed -e 's/^.*$/"&"/'
"filelist"
"listArgs"
"listArgs.C"
"name list"
"namelist"
"referenceNamelist"

  of course, if it's your favourite thing to do (quote strings) create
an alias for the sed command and just use ls|q or something like that
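
  a sketch of that idea in a Bourne-style shell. it is written as a
function rather than an alias because aliases are usually not expanded
inside scripts; the name q is just the one suggested above:

```shell
# q: wrap every line read from standard input in double quotes
q() {
    sed -e 's/^.*$/"&"/'
}

printf 'name list\nnamelist\n' | q
```

  in an interactive shell ls | q then does the same as the sed command
above. like that command, it goes wrong for names that contain a
newline or a double quote.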

  it's not that complicated but you have to know what you're doing...

  then again, it might be interesting to have the communication between
processes based on something else than just parsing plain text (not
entirely sure if it would be worth it), so that the program on the
receiving end of pipe would easily recognize separate filenames...


> variables is the worst problem, and you're also right that it isn't just
> a filenames issue.
> 
> >   IMO the only real way out of that mess is for shell variables to
> > behave like variables in programming languages (e.g. perl - you still
> > can get into similar kind of trouble but they are much easier to solve
> > and there's a lot more of the frequent situations where it works
> > properly). that would be fairly radical change for a shell...
> >
> >   as long as the variable names are replaced by the values of variables
> > and then the command is executed there will be the same kind of problems
> > as there are now
> 
> I think 99% of the problem could be handled by just having the shell
> put quotes around variable expansions that include embedded spaces:
> 
>     %A="foo bar"
>     %echo $A
>     "foo bar"
> 
> This would ensure that in a situation like
> 
>    for X in $A $B $C; do ... done

  yes, but in this case you are making a huge step - the variables in
shell are not like variables in 'normal' programming languages, they are
simply replaced by the text that is their value and then the command is
executed, so compared to 'normal' programming languages they are more
like macros. what you suggest is inconsistent with this, and IMO the
only way to make it work is to actually use 'real' variables (which, of
course, changes everything)
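
  the macro-like behaviour is easy to see in a small sketch. this
assumes a Bourne-style shell (sh, bash) with the default IFS; the
variable names and values are only for illustration:

```shell
A="foo bar"
B="baz"

# unquoted: the values are pasted into the command line and then split
# into words, so "foo bar" turns into two loop items
unquoted=0
for X in $A $B; do
    unquoted=$((unquoted + 1))
done

# quoted: each expansion stays a single word
quoted=0
for X in "$A" "$B"; do
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted, quoted: $quoted"   # prints "unquoted: 3, quoted: 2"
```

  the double quotes around "$A" are what keep the value in one piece
here.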

> that the values of A, B, and C are kept intact. And you would also want
> some trivial syntax to indicate that you don't want that behavior, so
> that the practice of putting lists in variables would still work.
> 
> >   there's nothing to prevent us to write another shell that would do it
> > THE ONLY RIGHT WAY:-) it might be interesting experiment. there's
> > already number of shells, some of them fairly different. isn't there
> > even perl shell?
> 
> Could be. I know there's a scheme shell, but that's a pretty radical
> difference, syntactically, from the Bourne-style shells.
> 
> >   then again, what if there's a quote in the quoted string? users are
> > crazy, they put the least expected characters into filenames. single
> > quotes are quite common (used as apostrophes). etc. and why shouldn't
> > they?
> 
> Yes, that's true. Spaces are really only one example. But you could
> fairly easily handle all printable characters except double-quotes,
> which is what Windows does. Bash will treat a single-quote as just

  you can't say that windows handles it: only some versions do, and only
in certain specific cases.

> another character inside a double-quoted string, so apostrophes can be
> handled that way -- likewise ampersands.
> 
> Fundamentally, the fact that the usual Unix filesystems let you put
> pretty much any character, printable or otherwise, into filenames, makes
> it very difficult for a command-line shell to handle all cases well.

  yes, that's why there should be no shell programming (notice that in
most other programming languages you do not have those problems). for
simple things that the shell is for, well, use perl:-) and if you really
need to use shell, be careful.

> However, it isn't that hard to deal with most printable characters,
> because in a string encased in double-quotes, the shell will treat
> single-quotes as ordinary text.

  we're talking programs here mostly, and programs cannot handle 90%,
they should handle 100%. if we're talking interactive use (typing
filenames) it's easy, just use command line autocompletion and shell
does all the quoting for you.

> Personally, I avoid spaces in filenames (on Unix systems) religiously,
> but it seems silly to claim there's something bad about them just
> because the shell and a few other tools are misdesigned.

  well, in most cases, as has been shown, there is a way out. so it's
not really that bad (but I also wish it were better).

	erik


