
Re: md5sum lots of files



Since MD5 can be rather resource intensive, you may find the following
utility useful for deciding which files actually need to be processed.

http://dev1.netkinetics.net/filetime/filetime.c

Download it, then: gcc -o /usr/bin/filetime filetime.c && strip /usr/bin/filetime

It returns a simple Unix timestamp of the last time a file was accessed.
That makes it very handy for determining whether a file has changed at all
before processing it again (or copying it, etc.), and it is very small and
very CPU friendly. Let filetime help decide which larger files need to be
md5'ed; this is especially useful if you keep a flat-file index of
times -> files that your shell script can read.
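
As a rough illustration of the flat-file index idea, here is a minimal
sketch. It does not use filetime itself (its options are not shown here);
it assumes GNU stat, grep and xargs as a stand-in and compares modification
times. The function names, the index file names and the "<mtime> <path>"
line format are all made up for the example, and file names containing
newlines are not handled.

```shell
#!/bin/sh
# Sketch only: re-hash just the files whose timestamp differs from a saved index.
# Assumption: GNU stat's "-c '%Y %n'" (mtime in epoch seconds, then the path)
# stands in for filetime. Index format (hypothetical): "<mtime> <path>" per line.

# Print one "<mtime> <path>" line for every regular file under directory $1.
build_index() {
    find "$1" -type f -exec stat -c '%Y %n' {} + | LC_ALL=C sort
}

# Print the paths whose line in the new index ($2) does not appear verbatim
# in the old index ($1): new files, or files whose timestamp changed.
changed_files() {
    grep -Fvxf "$1" "$2" | cut -d' ' -f2-
}

# Usage (hypothetical paths; create an empty index.old before the first run):
#   build_index /data > index.new
#   changed_files index.old index.new | xargs -r -d '\n' md5sum
#   mv index.new index.old
```

Only the files that show up in the changed list get passed to md5sum, so
an unchanged tree costs one stat per file rather than a full read of it.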

Filetime can read and compare from a list of files; the --help output
is pretty self-explanatory. It's much less disk intensive than find and a
bit more practical in unusual situations.

The original code was written by Nicholas Clements of www.option-c.com;
I'm simply maintaining and tweaking it. Hope you find it useful.

It's also really handy when comparing DNS zone files from many places at
once (which is why we had it written in the first place).

Best,
-Tim

On Sun, 2006-10-22 at 00:57 -0400, Chris Walters wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Hi,
> 
> Grok Mogger wrote:
> > Allan Wind wrote:
> >> On 2006-10-20T07:33:46-0700, Dave Carrigan wrote:
> >>> find . -type f -print0 | xargs -0 md5sum > /tmp/source.sums
> >>> cd /dest/dir
> >>> find . -type f -print0 | xargs -0 md5sum > /tmp/dest.sums
> >>> diff -u /tmp/source.sums /tmp/dest.sums
> >>
> >> Might need a sort in there before redirecting to files.
> >>
> >>
> >> /Allan
> >
> > How should I go about sorting it?
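
On the sorting question: one way, as a minimal sketch, is to pipe the
md5sum output through sort keyed on the file name before redirecting it,
so both lists come out in the same order no matter how find walks each
tree. This assumes GNU find, xargs and sort; sum_tree is just an
illustrative name, and the directories in the usage lines are placeholders.

```shell
#!/bin/sh
# Sketch only: write a checksum list sorted by path so two trees diff cleanly.
# sum_tree is a hypothetical helper name, not part of any existing tool.

# Print "<md5>  ./relative/path" for every regular file under directory $1,
# sorted on the path field. LC_ALL=C keeps the order identical across machines.
sum_tree() {
    ( cd "$1" && find . -type f -print0 | xargs -0 -r md5sum | LC_ALL=C sort -k2 )
}

# Usage (placeholder directories):
#   sum_tree /source/dir > /tmp/source.sums
#   sum_tree /dest/dir   > /tmp/dest.sums
#   diff -u /tmp/source.sums /tmp/dest.sums
```

Running the function from inside each directory keeps the paths relative,
which is what lets the two lists line up in diff.
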
> 
> Since I couldn't find your original post, I thought I'd reply to this
> one.  I've seen a lot of suggestions on how to do what you want, which
> if I understand correctly is to do an md5sum on a lot of files, recursively.
> 
> Well, there is a program that will do exactly that.  It is called
> md5deep.  I don't know if it is available through Debian, but it is
> available through Gentoo (source code), and should also be available
> here:  http://md5deep.sourceforge.net
> 
> You may have to use the -0 (zero) along with the -r option if your files
> have spaces or other odd characters in their names.  Other than that it
> is very easy to use (you can pipe the output to a file, then use the
> regular md5sum program to check the sums).  It also supports SHA1,
> Tiger, Whirlpool and a couple of others.
> 
> Regards,
> Chris
> 
> - --
> Please feel free to check out my webpage
> at http://cwalters999.home.comcast.net/
> 
> 'I am not who I think I am.  I am who I
> think you think I am...'
> -----BEGIN PGP SIGNATURE-----
> 
> iD8DBQFFOvomUx1jS/ORyCsRCKY5AJ0f9FOYSM71kXEGSDfSRZKAeVMcrACdFgoY
> XtpsIWXS71wWPy/rw9c84hU=
> =W86p
> -----END PGP SIGNATURE-----
> 
> 