
Re: MD5 listing format



On Sun, Nov 28, 2021 at 12:57:00PM -0600, David Wright wrote:
> I was taken by surprise by the following output from md5sum:
> $ echo special/*
> special/C:\nppdf32Log\debuglog.txt special/same-contents
> $ md5sum special/*
> \adfc1d2f1b1d6c7fcaa51e857c1a6f68  special/C:\\nppdf32Log\\debuglog.txt
> adfc1d2f1b1d6c7fcaa51e857c1a6f68  special/same-contents

Fun.

> I don't understand why it pollutes the first field in its output.

Well, it doesn't bother to *document* why it does this, so we can only
guess (or source-dive).

> I would have thought it sufficient to mangle the filename if it
> feels it has to (echo doesn't bother).

Perhaps it prepends the \ character to the output line to indicate to
whoever's reading this file (which may be md5sum itself, in --check
mode) that a filename mangling *has occurred* and needs to be accounted
for.

Otherwise, how would the reader know whether the filename is actually

C:\\nppdf32Log\\debuglog.txt

or

C:\nppdf32Log\debuglog.txt

... and, upon further investigation, it turns out md5sum is part of GNU
coreutils.  Which means the man page that I've been reading *is not the
documentation*.  Fuckers.

In the blighted *info page*, there's this paragraph:

   For each FILE, ‘md5sum’ outputs by default, the MD5 checksum, a
space, a flag indicating binary or text input mode, and the file name.
Binary mode is indicated with ‘*’, text mode with ‘ ’ (space).  Binary
mode is the default on systems where it’s significant, otherwise text
mode is the default.  Without ‘--zero’, if FILE contains a backslash or
newline, the line is started with a backslash, and each problematic
character in the file name is escaped with a backslash, making the
output unambiguous even in the presence of arbitrary file names.  If
FILE is omitted or specified as ‘-’, standard input is read.
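
For what it's worth, the rule in that paragraph is small enough to sketch in
Python. This is only an illustration of the documented behaviour, not the
coreutils source: the function names are made up, it assumes text mode, and it
handles only the two characters the paragraph mentions (backslash and newline).

```python
def escape_name(name):
    """Escape backslash and newline the way the info page describes."""
    return name.replace("\\", "\\\\").replace("\n", "\\n")


def unescape_name(escaped):
    """Reverse escape_name(); reject escapes the sketch does not know."""
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(escaped):
            raise ValueError("trailing backslash in escaped name")
        nxt = escaped[i + 1]
        if nxt == "\\":
            out.append("\\")
        elif nxt == "n":
            out.append("\n")
        else:
            raise ValueError("unknown escape: \\" + nxt)
        i += 2
    return "".join(out)


def format_line(digest, name):
    """Build one output line: optional flag, digest, space, mode, name."""
    escaped = escape_name(name)
    # The leading backslash tells the reader that the name was mangled.
    flag = "\\" if escaped != name else ""
    return flag + digest + "  " + escaped  # second space = text-mode flag


def parse_line(line):
    """Recover (digest, name) from a line produced by format_line()."""
    mangled = line.startswith("\\")
    if mangled:
        line = line[1:]
    digest, rest = line.split(" ", 1)
    name = rest[1:]  # rest[0] is the mode flag (' ' or '*')
    return digest, unescape_name(name) if mangled else name
```

With this sketch, format_line() on special/C:\nppdf32Log\debuglog.txt gives the
same line md5sum printed above, and parse_line() only undoes the escaping when
the line starts with a backslash. A line without that leading backslash keeps
its file name verbatim, which is how the two spellings of the name stay
distinguishable.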

