
Re: CR/LF



On Sun, Dec 11, 2022 at 08:16:35AM +0100, tomas@tuxteam.de wrote:
> That said. Greg, I was also shaken by your roaring tone.

Yeah, well, he was told the same thing, repeatedly, by multiple people,
and somehow he managed to ignore every single instance of it.

It's rather frustrating.

As a formal statement for anyone else who's reading this, who might
actually listen:


echo ${TEST}  does NOT show the contents of the TEST variable reliably.


For the following reasons:

1) ${TEST} is not a substitute for "$TEST".  They do not mean the same
   thing.  You MUST double-quote the variable when you expand it, or
   else the contents will undergo word splitting and pathname expansion.

   There are specific cases where the double-quotes may be omitted, but
   until you know what those cases are, it's best to use the quotes
   every time.  This is not one of those cases.

2) echo may interpret the content of TEST as an option (-n or -e), or it
   may interpret backslash sequences inside the content, depending on
   which shell you're using, and which platform you're on.

   The use of echo with variable arguments is therefore strongly
   discouraged.

3) echo usually, but not always, adds an additional newline character to
   the output.  In most cases, this is acceptable, even preferable.  But
   when the OP is complaining of an "extra CR/LF" [sic], but is using
   echo to produce the extra newline himself, well... there you have it.
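
The three reasons above can be reproduced with a short script.  This is
only a sketch: the values in TEST, star and dashn are made up for the
demonstration, mktemp -d is assumed to be available, and the -n result
assumes bash's builtin echo with default settings.

```shell
#!/bin/bash
# Hypothetical sample content; none of these values come from the thread.
TEST='a   b'

# Reason 1a: unquoted expansion undergoes word splitting, so the run of
# three spaces collapses to a single space.
unquoted=$(echo ${TEST})
quoted=$(printf '%s' "$TEST")

# Reason 1b: unquoted expansion also undergoes pathname expansion.
# A scratch directory keeps the set of matching files predictable.
tmpdir=$(mktemp -d)
touch "$tmpdir/one" "$tmpdir/two"
star='*'
globbed=$(cd "$tmpdir" && echo ${star})
rm -rf "$tmpdir"

# Reason 2: content that looks like an option is consumed by echo.
dashn='-n'
echo_out=$(echo "$dashn")
printf_out=$(printf '%s' "$dashn")

# Reason 3: echo appends a newline; printf %s does not.
# The arithmetic expansion strips any padding that wc adds.
echo_len=$(( $(echo "$TEST" | wc -c) ))
printf_len=$(( $(printf '%s' "$TEST" | wc -c) ))

printf 'unquoted=[%s] quoted=[%s]\n' "$unquoted" "$quoted"
printf 'globbed=[%s]\n' "$globbed"
printf 'echo=[%s] printf=[%s]\n' "$echo_out" "$printf_out"
printf 'echo bytes=%d printf bytes=%d\n' "$echo_len" "$printf_len"
```

The brackets in the final printf calls make leading, trailing and
repeated whitespace visible in the output.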


If you would like to see the contents of a variable in bash, you have
a few viable choices:

1) If you just want to *see* it, as a human being, so that you can verify
   that it appears to be correct, use "declare -p varname".

   unicorn:~$ test=$(:)
   unicorn:~$ declare -p test
   declare -- test=""
   unicorn:~$ test=$(printf '%s\n' foo bar baz)
   unicorn:~$ declare -p test
   declare -- test="foo
   bar
   baz"

   With this output format, you can tell where the newlines are, and
   aren't.  Note that there is no trailing newline, because the command
   substitution has removed it.

   This format does have weaknesses, however.  One of them is CRs.
   If CRs are actually present, then you get this:

   unicorn:~$ test=$(printf '%s\r\n' foo bar baz)
   unicorn:~$ declare -p test
   declare -- test="foo
   bar
   "az

   A person experienced with CRs may spot the telltale sign, but for
   other people, it might be too subtle.

2) If you would like the script to print the content of the variable to
   stdout, with no modifications, so that some other program can read
   it and use it, use this:

   printf %s "$test"

   The double quotes are required.

   The %s format specifier is required, for two reasons:

      a) If the content of the variable begins with - (hyphen), printf
         might interpret it as an option, if it appears as the first
         argument.  But if it's behind the format specifier, it's safe.

      b) If the variable contains % or \ sequences, and it appears as
         the first argument, printf will interpret them, because the
         first argument is the format string.

   Therefore you should never use printf "$test" without the %s.

3) If you'd like to see the content in a fully unambiguous way, to debug
   the thing that's producing it, use this:

   printf %s "$test" | hexdump -C

   unicorn:~$ test=$(printf '%s\r\n' foo bar baz)
   unicorn:~$ printf %s "$test" | hexdump -C
   00000000  66 6f 6f 0d 0a 62 61 72  0d 0a 62 61 7a 0d        |foo..bar..baz.|
   0000000e

   Here, the CRs (0d) and LFs (0a) are explicitly visible.

   You may substitute your own preferred hexdump variant, if you don't
   like this specific format, or if you don't have access to this
   specific tool.  Debian includes hexdump in bsdextrautils, which is
   a dependency of man-db (which is "important"), so everyone should
   have it.  I include this caveat just in case someone reading this
   (who isn't Jim P) is on a non-Debian system.

   Debian also includes "hd" which is a synonym for "hexdump -C".

   If you need a POSIX-compatible alternative, there's always od:

   unicorn:~$ printf %s "$test" | od -t a -t x1
   0000000   f   o   o  cr  nl   b   a   r  cr  nl   b   a   z  cr
            66  6f  6f  0d  0a  62  61  72  0d  0a  62  61  7a  0d
   0000016

   The hardest part of using od for this is remembering those options.
   With no options, you get a much less appealing output format.  One
   might even call it useless, at least for this particular application.
   The -Ax or -An option may also be desired, but is less important.
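
Choices 2 and 3 can be combined into a small helper, roughly like the
sketch below.  The function name show_bytes and the sample value in
tricky are my own inventions for illustration, not something from the
thread.  The helper prefers hexdump -C and falls back to od when
hexdump is not installed.

```shell
#!/bin/bash
# show_bytes is a hypothetical helper name.  It writes its argument
# unmodified with printf %s and pipes the bytes to a dumper.
show_bytes() {
    if command -v hexdump >/dev/null 2>&1; then
        printf '%s' "$1" | hexdump -C
    else
        # POSIX fallback: named characters plus hex bytes.
        printf '%s' "$1" | od -t a -t x1
    fi
}

# Same sample as above: three words, each followed by CR LF.
# The command substitution strips the final LF but keeps the CR.
test=$(printf '%s\r\n' foo bar baz)
show_bytes "$test"

# Why %s matters: made-up content holding a % sequence and a \ sequence.
tricky='100%d\n'
safe=$(printf '%s' "$tricky")    # content passes through untouched
unsafe=$(printf "$tricky")       # content is parsed as a format string

printf 'safe=[%s] unsafe=[%s]\n' "$safe" "$unsafe"
```

In the unsafe case, %d has no matching argument and is printed as 0,
and the \n becomes a real newline which the command substitution then
strips, so the variable's content does not survive.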

