
Re: par2



On 09/08/14 06:02 PM, AW wrote:
On Sat, 09 Aug 2014 16:37:52 -0400
Gary Dale <garydale@torfree.net> wrote:

  > The speed of the check is usually limited by the speed of reading the
  > file(s) from disk. A par2 check is more direct and will also
  > automatically repair any bit rot that has developed.

Definitely not.

For very small files nearly all methods of error checking are about the same.
For large files, there are massive time differences between md5, sha1, par2.
The longest time, by far, is par2 checking.  I even did a simple check myself
to ensure this is true...

Here are the results:

Summary:
par2 verify takes about twice as long as sha1 for large files.
sha1 verify takes about twice as long as md5 for large files.

par2 creation takes about 21 times longer than sha1 generation for large files.
sha1 creation takes about twice as long as md5 for large files.
----------------------------------------------------------------------------------------------------
Details:
For check generation:
10 x 1024 files for md5sum generation
Elapsed time is 0.00465393 seconds.

10 x 1024 files for sha1sum generation
Elapsed time is 0.00407004 seconds.

3 x 1GB and 2 x 2GB files for md5sum generation
Elapsed time is 13.0712 seconds.

3 x 1GB and 2 x 2GB files for sha1sum generation
Elapsed time is 22.3703 seconds.

10 x 1024 files for par2 generation
Elapsed time is 0.0724349 seconds.

3 x 1GB and 2 x 2GB files for par2 generation
Elapsed time is 471.907 seconds.
----------------------------------------------------------------------------------------------------
For verify of check:
10 x 1024 files for md5sum verify
Elapsed time is 0.00395489 seconds.

10 x 1024 files for sha1sum verify
Elapsed time is 0.00317788 seconds.

3 x 1GB and 2 x 2GB files for md5sum verify
Elapsed time is 12.9887 seconds.

3 x 1GB and 2 x 2GB files for sha1sum verify
Elapsed time is 22.6091 seconds.

10 x 1024 files for par2 verify
Elapsed time is 0.019568 seconds.

3 x 1GB and 2 x 2GB files for par2 verify
Elapsed time is 51.4989 seconds.
----------------------------------------------------------------------------------------------------
CPU:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 26
Stepping:              5
CPU MHz:               1600.000
BogoMIPS:              6414.40
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
----------------------------------------------------------------------------------------------------
Memory:               24GB

--Andrew


Your results with respect to creation aren't really relevant because neither md5 nor sha creates repair files. They can only tell you whether a file has errors. You need to create the par2 repair files no matter which check you use.

It's the verify times that are important. Moreover, you have picked the weakest sha check, which is little better than md5sum (160 bits versus 128).
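
For reference, the digest widths can be checked with a short Python sketch. It assumes the standard hashlib module and a build that still ships md5; the helper name digest_bits is made up for illustration.

```python
import hashlib

def digest_bits(name):
    """Return the digest width in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8

# Compare md5 and sha1 with two of the longer SHA-2 variants.
for name in ("md5", "sha1", "sha256", "sha512"):
    print(name, digest_bits(name))
```

Digest width is only one measure of strength, so this shows the "160 bits versus 128" figure and nothing more.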

I note also that you don't actually break out the CPU time from the disk i/o time for the different methods. You simply take the elapsed time. This doesn't show that the difference in times is CPU time and not I/O time.
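
One rough way to separate the two, sketched in Python rather than with the tools used in the test: time the hash with both a wall clock and a process CPU clock. The function name timed_digest and the 1 MiB chunk size are assumptions for illustration, not anything from the original test.

```python
import hashlib
import time

def timed_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file and return (hex digest, wall seconds, CPU seconds)."""
    h = hashlib.new(algorithm)
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return h.hexdigest(), wall, cpu
```

If the wall time is much larger than the CPU time, the process mostly waited on the disk. If the two are close, the hash itself is the bottleneck. Files already in the page cache will skew the result, so the cache state needs to be the same for every method being compared.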

Nor do you specify whether you created individual check files for each test file (normal for md5 and sha) or created a single check file for all the files being tested (the way par2 is usually used).

You do show that par2 seems to scale better than sha or md5 when dealing with large files. While sha1sum is 6 times faster than par2 on the small-file test, it is only about twice as fast on the large-file test. However, neither result may be relevant here because you don't specify how the testing was done.
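
The ratios can be recomputed from the posted verify times with a few lines of Python. The numbers are copied from the figures quoted above; the dictionary layout and the helper name slowdown are only illustrative.

```python
# Verify times in seconds, copied from the quoted results.
verify_seconds = {
    "small": {"md5": 0.00395489, "sha1": 0.00317788, "par2": 0.019568},
    "large": {"md5": 12.9887, "sha1": 22.6091, "par2": 51.4989},
}

def slowdown(size, slower, faster):
    """How many times longer `slower` took than `faster` for a file size."""
    return verify_seconds[size][slower] / verify_seconds[size][faster]

for size in ("small", "large"):
    print(f"{size}: par2/sha1 = {slowdown(size, 'par2', 'sha1'):.2f}")
```

This gives a factor of a little over 6 for the small files and a little over 2 for the large ones.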

