Re: cp output format

To: debian-user@lists.debian.org
Subject: Re: cp output format
From: David Wright <deblis@lionunicorn.co.uk>
Date: Sun, 19 Jul 2015 21:14:48 -0500
Message-id: <20150720021448.GA16059@alum>
Mail-followup-to: debian-user@lists.debian.org
In-reply-to: <20150719095908.GA40755@phare.normalesup.org>
References: <20150716122313.GA4611@engels.historicalmaterialism.info> <55A8DED8.1080309@affinityvision.com.au> <20150717111601.GA2791145@phare.normalesup.org> <20150718223643.GB13747@alum> <20150719095908.GA40755@phare.normalesup.org>

Quoting Nicolas George (george@nsup.org):
> On décadi 30 Messidor, year CCXXIII, David Wright wrote:
> > > And of course (unless the files are large (unlikely for .forward) and on the
> > > same mechanical drive), cmp file1 file2 is much simpler.
> > I may've missed something here. I can't think why computing the
> > md5/sha-2 digest would ever be better or simpler than cmp, even
> > if the files are large and/or on the same spindle.
> 
> You missed the end of the parenthesized text. Try this:
> 
> cmp /cdrom/300_megs_file_1 /cdrom/300_megs_file_2
> 
> ... and when you are done buying a replacement for your optical drive, you
> can tell me if cmp was really better than a hash.
> 
> The explanation is: If the files are large, then neither the application nor
> the kernel will read them all at once. Therefore, with cmp, reads will happen
> alternately on each file until the end.

I see your point now. Fortunately I always put a .md5 file on CDs
which contains the digests of all the files. So I'll pass on trying it.
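
To make the access pattern concrete, here is a minimal Python sketch of a
chunk-by-chunk comparison. It is not cmp's actual implementation; the function
name compare_interleaved and the 64 KiB chunk size are assumptions chosen for
illustration. The point is only that reads alternate between the two files,
so on a single mechanical or optical drive the head has to move back and
forth for every chunk.

```python
def compare_interleaved(path_a, path_b, chunk_size=64 * 1024):
    """Return True if both files have identical contents.

    Hypothetical helper: reads alternate between the two files, one
    chunk at a time, which is the seek-heavy pattern described above.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)  # read from the first file...
            chunk_b = fb.read(chunk_size)  # ...then jump to the second
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files ended at the same point
                return True
```

A larger chunk size means fewer alternations, at the cost of more memory per
read.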

> If the files are not already present in the cache and are on the same
> mechanical drive, that means moving the read head hundreds of times. Even if
> it does not kill your drive, it will be awfully slow.
> 
> With hashes, unless you make the mistake of running the hashes in parallel
> thinking you will save time, the first file is read in full and then the
> second, and everything goes as fast as sequential reads.

I've used digests for pruning identical files from backups and they're
computed serially, fortunately. So by accident I hadn't run into the
problem you outline. But many thanks for elaborating.
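
A minimal sketch of the serial approach, in Python: each file is read from
start to finish before the next one is opened, so the drive only ever sees
sequential reads. The names sha256_of and group_identical, the choice of
SHA-256, and the 1 MiB chunk size are assumptions for illustration, not the
script actually used for the backups.

```python
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read sequentially."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the empty bytes returned at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical(paths):
    """Return lists of paths whose digests match (hypothetical helper)."""
    by_digest = defaultdict(list)
    for path in paths:  # strictly one file after another, never in parallel
        by_digest[sha256_of(path)].append(path)
    return [group for group in by_digest.values() if len(group) > 1]
```

Matching digests make identical contents overwhelmingly likely; a cautious
pruning script could still confirm each candidate pair with cmp before
deleting anything.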

Cheers,
David.
