Re: cp output format
Quoting Nicolas George (george@nsup.org):
> On décadi 30 Messidor, year CCXXIII, David Wright wrote:
> > > And of course (unless the files are large (unlikely for .forward) and on the
> > > same mechanical drive), cmp file1 file2 is much simpler.
> > I may've missed something here. I can't think why computing the
> > md5/sha-2 digest would ever be better or simpler than cmp, even
> > if the files are large and/or on the same spindle.
>
> You missed the end of the parenthesized text. Try this:
>
> cmp /cdrom/300_megs_file_1 /cdrom/300_megs_file_2
>
> ... and when you are done buying a replacement for your optical drive, you
> can tell me if cmp was really better than a hash.
>
> The explanation is: If the files are large, then neither the application nor
> the kernel will read them in one go. Therefore, with cmp, reads will happen
> alternately on each file until the end.
I see your point now. Fortunately I always put a .md5 file on CDs
which contains the digests of all the files. So I'll pass on trying it.
> If the files are not already present in the cache and are on the same
> mechanical drive, that means moving the read head hundreds of times. Even if
> it does not kill your drive, it will be awfully slow.
>
> With hashes, unless you make the mistake of running the hashes in parallel
> thinking you will save time, the first file is read in full and then the
> second, and everything goes as fast as sequential reads.
I've used digests for pruning identical files from backups and they're
computed serially, fortunately. So by accident I hadn't run into the
problem you outline. But many thanks for elaborating.
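For the archive, here is a minimal sketch of the serial comparison
described above. It assumes sha256sum from GNU coreutils is installed;
same_content is a made-up name for illustration, not an existing tool.

```shell
# Hypothetical helper: compare two files by digest, hashing them one
# after the other so that each file is read sequentially in full.
# Returns 0 if the digests match, 1 if they differ, 2 on a read error.
same_content() {
    # Reading from stdin makes sha256sum print "<digest>  -".
    sum_a=$(sha256sum < "$1") || return 2   # first file, start to end
    sum_b=$(sha256sum < "$2") || return 2   # only then the second file
    # Strip everything after the first space to keep just the digest.
    [ "${sum_a%% *}" = "${sum_b%% *}" ]
}
```

Usage would be something like

    same_content file1 file2 && echo "same digest"

For small files, or files on different drives, plain cmp is still the
simpler choice, as noted earlier in the thread.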
Cheers,
David.