
Re: unexpected IFS behavior with the newline character



> Here is the relevant documentation from bash (1):
>    Word Splitting

Thanks. (It turns out that the relevant information is in the "Quoting" section of that document.)

The solution was to set the IFS variable to a literal newline (^J) with the $'' quoting form. The proper form of the given example is thus:

IFS=$'\n'
FOO=$'alpha\nbravo\ncharlie'

The IFS="\n" statement puts two characters into the variable, a backslash and the letter n, which isn't the desired result. The shell itself does not expand escape sequences outside of the $'' form, although many programs like /bin/echo will interpret them, which led to my confusion.
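
As a quick check of the difference, here is a minimal sketch, assuming bash (the $'' form is not available in every shell). The variable names wrong, right, old_ifs and count are only illustrative.

```shell
# Assumes bash. Names other than IFS and FOO are illustrative.
wrong="\n"       # two characters: a backslash and the letter n
right=$'\n'      # one character: a newline

echo "${#wrong} ${#right}"    # lengths of the two values

FOO=$'alpha\nbravo\ncharlie'
old_ifs=$IFS
IFS=$'\n'
set -- $FOO      # left unquoted on purpose so that word splitting happens
IFS=$old_ifs
count=$#

echo "$count"    # number of fields after splitting on newline
```

With the newline form, the three words come out as three fields and no empty fields appear in between.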


Bijan Soleymani wrote:
On Sun, Oct 19, 2003 at 07:15:11PM -0400, Darik Horn wrote:

Can anybody explain this shell behavior? -- It doesn't seem consistent with the documentation about how whitespace is interpreted as an internal field separator.


Here is the relevant documentation from bash (1):
   Word Splitting
       The shell scans the results of parameter expansion,  command  substitu-
       tion,  and arithmetic expansion that did not occur within double quotes
       for word splitting.

       The shell treats each character of IFS as a delimiter, and  splits  the
       results of the other expansions into words on these characters.  If IFS
       is unset, or its value is exactly <space><tab><newline>,  the  default,
       then  any  sequence  of IFS characters serves to delimit words.  If IFS
       has a value other than the default, then sequences  of  the  whitespace
       characters  space  and  tab are ignored at the beginning and end of the
       word, as long as the whitespace character is in the value  of  IFS  (an
       IFS  whitespace  character).   Any  character  in  IFS  that is not IFS
       whitespace, along with any adjacent IFS whitespace characters, delimits
       a  field.  A sequence of IFS whitespace characters is also treated as a
       delimiter.  If the value of IFS is null, no word splitting occurs.

       Explicit null arguments ("" or '')  are  retained.   Unquoted  implicit
       null arguments, resulting from the expansion of parameters that have no
       values, are removed.  If a parameter with no value is  expanded  within
       double quotes, a null argument results and is retained.

       Note that if no expansion occurs, no splitting is performed.


# IFS="\n"
# FOO="alpha\nbravo\ncharlie"
# for i in $FOO; do echo "%${i}%"; done
%alpha%
%%
%bravo%
%%
%charlie%

When ash or bash does the word splitting, an empty field is produced between the words. This behavior seems to happen only with the newline character.

# IFS="\n"
# FOO="alpha\n\n\nbravo\n\ncharlie"
# for i in $FOO; do echo "%${i}%"; done
%alpha%
%%
%%
%%
%%
%%
%bravo%
%%
%%
%%
%charlie%

In this case, the newline characters are not folded together like whitespace. Generally, there are '2n-1' null expansions for each 'n' instances of the newline character in the input word.
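
The count can be reproduced with a short sketch, assuming (as the follow-up at the top of this message notes) that IFS="\n" really holds the two characters backslash and n. Single quotes are used here so that the backslashes are unambiguous; the variable name empty is illustrative.

```shell
# Assumes bash; IFS holds a backslash and the letter n, not a newline.
IFS='\n'
FOO='alpha\n\n\nbravo\n\ncharlie'

empty=0
for i in $FOO; do
    # Count the fields that came out empty.
    if [ -z "$i" ]; then
        empty=$((empty + 1))
    fi
done
unset IFS        # back to default splitting

echo "$empty"    # (2*3 - 1) + (2*2 - 1) empty fields
```

Each "\n" pair contributes two delimiter characters, so a run of n pairs has 2n delimiters and 2n-1 empty fields between them.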


The critical part of the documentation is:
       The shell treats each character of IFS as a delimiter, and  splits  the
       results of the other expansions into words on these characters.  If IFS
       is unset, or its value is exactly <space><tab><newline>,  the  default,
       then  any  sequence  of IFS characters serves to delimit words.  If IFS
       has a value other than the default, then sequences  of  the  whitespace
       characters  space  and  tab are ignored at the beginning and end of the
       word, as long as the whitespace character is in the value  of  IFS  (an
       IFS  whitespace  character).   Any  character  in  IFS  that is not IFS
       whitespace, along with any adjacent IFS whitespace characters, delimits
       a  field.  A sequence of IFS whitespace characters is also treated as a
       delimiter.  If the value of IFS is null, no word splitting occurs.

So only "If IFS is unset, or its value is exactly
<space><tab><newline>, the default," do you get the behaviour
you want, namely that "any sequence of IFS characters serves
to delimit words". This explains why the newline characters are not
folded together.
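
The distinction the quoted paragraph draws between IFS whitespace and other IFS characters can be sketched as follows, assuming bash. The variable names are illustrative. The last case reflects my understanding that a newline placed in IFS also counts as IFS whitespace, so a run of real newlines is still treated as one delimiter.

```shell
# Assumes bash. Variable names are illustrative.
words='alpha   bravo'
fields='alpha:::bravo'
lines=$'alpha\n\n\nbravo'

unset IFS                        # default: space, tab, newline
set -- $words;  ws_count=$#      # a run of spaces is one delimiter

IFS=:                            # a non-whitespace delimiter
set -- $fields; colon_count=$#   # every colon delimits a field

IFS=$'\n'                        # a real newline only
set -- $lines;  nl_count=$#      # a run of newlines is one delimiter

unset IFS
echo "$ws_count $colon_count $nl_count"
```

Only the colon case produces empty fields, which matches the output seen when IFS held a backslash and the letter n.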

The part of the documentation relevant to null arguments is:
       Explicit null arguments ("" or '')  are  retained.   Unquoted  implicit
       null arguments, resulting from the expansion of parameters that have no
       values, are removed.  If a parameter with no value is  expanded  within
       double quotes, a null argument results and is retained.

I don't think this says that null arguments will be removed. I think
that "null arguments, resulting from the expansion of parameters that
have no values, are removed" means that nulls are removed in a case
like this:
$ message="Hi!"
$ echo $null $message $null
Hi!
$
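
The same rule can be checked by counting arguments. This is a minimal sketch, assuming bash or any POSIX shell, with null deliberately left unset; the names unquoted and quoted are illustrative.

```shell
# null is intentionally unset, so $null expands to nothing.
unset null
message='Hi!'

set -- $null $message $null      # unquoted implicit nulls are removed
unquoted=$#

set -- "$null" $message ""       # quoted and explicit nulls are retained
quoted=$#

echo "$unquoted $quoted"
```

The first count covers only the message, while the second also includes the two null arguments.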
Hope that helps,
Bijan


