Re: suboptimal cat usage

To: Frans Pop <elendil@planet.nl>
Cc: debian-boot@lists.debian.org
Subject: Re: suboptimal cat usage
From: Philip Hands <phil@hands.com>
Date: Wed, 28 May 2008 18:53:32 +0100
Message-id: <[🔎] 20080528175332.GG5361@hands.com>
In-reply-to: <[🔎] 200805271715.13880.elendil@planet.nl>
References: <[🔎] 20080527135329.GR8806@hands.com> <[🔎] 200805271715.13880.elendil@planet.nl>

On Tue, May 27, 2008 at 05:15:04PM +0200, Frans Pop wrote:
> On Tuesday 27 May 2008, Philip Hands wrote:
> > I thought I'd have a look through the d-i code for instances of:
> >
> >   cat ... |
> >
> > to see if I could find any code with room for improvement, and came up
> > with the attached patch.  Any comments before I commit these?  (along
> > with relevant changelog entries, of course, and removing the comment
> > about removing a space from the disk name that's only there to annotate
> > the patch)
> 
> I've had a quick look, but after only a few items I have some doubts...
> 
> > -		[ $(cat "$file" | wc -l) -gt 1 ] || continue
> > +		[ $(wc -l "$file") -gt 1 ] || continue
> 
> This is plain broken:
> $ cat massbuild.versions | wc -l
> 3
> $ wc -l massbuild.versions
> 3 massbuild.versions

Ah, well spotted -- typical that the one that I thought was _so_ obvious
that it didn't need testing, did need testing -- Doh!

That should be this instead:

  wc -l < massbuild.versions

"cat file | foo" can always be done more efficientlyi as "foo < file"
but I generally leave out the redirection if I can get away with it
(which of course in this case, I could not :-/ )

> I happen to know this one, but I bet there are there other commands that
> have subtle differences in output when using a file argument versus a pipe.

well, we could go for the redirection approach instead, but I'd be very
surprised if you found any more such errors, as I'm pretty sure that the
other examples were well tested (except that I'll admit to having tested
them under busybox on a full Debian system, rather than in a d-i system --
I'll test them all again under d-i proper if you think that there's a
danger that busybox varies in its behaviour (which I _seriously_ hope
it doesn't)

> > -	for dir in $( (cat /tmp/mount.pre /tmp/mount.pre; mountpoints ) | \
> > -		     sort -r | uniq -c | grep "^[[:space:]]*1[[:space:]]" | \
> > -		     sed "s/^[[:space:]]*[0-9][[:space:]]//"); do
> > +	for dir in $(mountpoints | sort -r /tmp/mount.pre /tmp/mount.pre - | uniq -u); do
> 
> 1) I find the combination of using a pipe _and_ named files for sort
>    unintuitive and less readable

Yeah, I'll give you that -- I was wondering if that was such a good idea.

> 2) Where did the grep and sed statements disappear to?

they're a somewhat log-winded reimplementation of uniq -u

What they do is:

  uniq -c  --- list each unique line, prefixed with a count of occurrences
  grep ... --- grep out all the lines that start with a count of 1
  sed ...  --- strip off the 1

i.e. list all the lines that only occur once, which as uniq(1) says:

         -u, --unique
              only print unique lines

For improved readability, how about:

  for dir in $( (cat /tmp/mount.pre /tmp/mount.pre; mountpoints) | \
                sort -r | uniq -u); do

Cheers, Phil.

Reply to:

Follow-Ups:
- Re: suboptimal cat usage
  - From: Frans Pop <elendil@planet.nl>

References:
- suboptimal cat usage
  - From: Philip Hands <phil@hands.com>
- Re: suboptimal cat usage
  - From: Frans Pop <elendil@planet.nl>

Prev by Date: Re: d-i status wrt i386 & amd64 EFI machines
Next by Date: Re: Missing glyphs
Previous by thread: Re: suboptimal cat usage
Next by thread: Re: suboptimal cat usage
Index(es):
- Date
- Thread