
Re: File copy method that is twice as fast as "cp -a".



hiya karl 

yes!! buffer is a neat little program... forgot all about it...

and a good test procedure... at least easy enough to understand...

and to tweak it more... one could increase the default rsize 
and wsize mount options
	- the days of slow tape drives...hummm...
	- stop go..stop...go...rewind....stop...go....rewind...
	=
	= stick in buffer...and watch(listen to) the tape hummm....
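a hedged sketch of that tweak (the server name, export path, and the 8k
sizes here are illustrative... the right values depend on your kernel,
NIC, and network):

```text
# /etc/fstab: bump the NFS read/write transfer sizes up from the
# small defaults (values and names below are examples only)
fileserver:/export/backup  /mnt/backup  nfs  rsize=8192,wsize=8192  0  0

# or one-off from the command line:
#   mount -t nfs -o rsize=8192,wsize=8192 fileserver:/export/backup /mnt/backup
```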

thanx
alvin
http://www.Linux-Backup.net ....


On 3 Jun 2001, Karl M. Hegbloom wrote:

> 
> 
>  I found a way of copying files from one drive to another that is
>  significantly faster than "cp -a"...  (this is just the sort of
>  geeky++ type stuff you guys like to read, I bet.)
> 
>  See if you can follow along and see what I did.  The
>  "cvs.gnome.org" directory contains a checkout of the "gnome" and
>  "CVSROOT" modules only.
> 
> root@karl:~
> # du -hs /usr/local/src/cvs.gnome.org 
> 260M    /usr/local/src/cvs.gnome.org
> 
>  First, time the "cp -a".
> 
> root@karl:~
> # time cp -a /usr/local/src/cvs.gnome.org /mnt/tmp/src
> cp -a /usr/local/src/cvs.gnome.org /mnt/tmp/src  0.37s user 13.81s system 27% cpu 51.674 total
> 
>  Now let's try using "tar" commands.
> 
> root@karl:~
> # time (cd /usr/local/src/ && tar pcf - cvs.gnome.org) | (cd /mnt/tmp/src/ && tar pxf -)
> ( cd /usr/local/src/ && tar pcf - cvs.gnome.org )  0.77s user 8.02s system 16% cpu 53.759 total
> ( cd /mnt/tmp/src/ && tar pxf - )  0.68s user 12.58s system 24% cpu 53.757 total
> 
>  Hmmm.  That took slightly longer.  Let's try "cpio".
> 
> # time (cd /usr/local/src/ && find cvs.gnome.org -print0 | cpio -p0 /mnt/tmp/src)  
> 387800 blocks
> ( cd /usr/local/src/ && find cvs.gnome.org -print0 | cpio -p0 /mnt/tmp/src )  0.62s user 20.14s system 33% cpu 1:01.40 total
> root@karl:~
> # rm -rf /mnt/tmp/cvs.gnome.org
> 
>  That was a lot slower.  Both "find" and "cpio" must stat every file.
>  There is no benefit to having two processes at work here.
> 
>  Let's try something else.  I seem to recall seeing some kind of
>  buffering program meant for use when copying things across the
>  network or to a tape drive using "tar", one time when I ran "dselect"
>  and browsed the great plethora of available software packages...  A
>  quick "apt-cache search 'buffer'" gives me a 92 line list, from which
>  I choose the one I need:
> 
> root@karl:~
> # apt-get install 'buffer'
> Reading Package Lists... Done
> Building Dependency Tree... Done
> The following NEW packages will be installed:
>   buffer
> 0 packages upgraded, 1 newly installed, 0 to remove and 3  not upgraded.
> Need to get 12.6kB of archives. After unpacking 77.8kB will be used.
> Get:1 http://zeus.kernel.org unstable/main buffer 1.19-1 [12.6kB]
> Fetched 12.6kB in 0s (17.9kB/s)
> Selecting previously deselected package buffer.
> (Reading database ... 189510 files and directories currently installed.)
> Unpacking buffer (from .../buffer_1.19-1_i386.deb) ...
> Setting up buffer (1.19-1) ...
> 
> root@karl:~
> # buffer --help
> buffer: invalid option -- -
> Usage: buffer [-B] [-t] [-S size] [-m memsize] [-b blocks] [-p percent] [-s blocksize] [-u pause] [-i infile] [-o outfile] [-z size]
> -B = blocked device - pad out last block
> -t = show total amount written at end
> -S size = show amount written every size bytes
> -m size = size of shared mem chunk to grab
> -b num = number of blocks in queue
> -p percent = don't start writing until percent blocks filled
> -s size = size of a block
> -u usecs = microseconds to sleep after each write
> -i infile = file to read from
> -o outfile = file to write to
> -z size = combined -S/-s flag
> 
>  Ok, let's try it...
> 
> root@karl:~
> # time (cd /usr/local/src/ && tar pcf - cvs.gnome.org) | buffer -m 8m | (cd /mnt/tmp/src/ && tar pxf -)
> ( cd /usr/local/src/ && tar pcf - cvs.gnome.org )  0.55s user 6.11s system 12% cpu 53.905 total
> buffer -m 8m  0.14s user 2.10s system 4% cpu 53.914 total
> ( cd /mnt/tmp/src/ && tar pxf - )  0.84s user 16.51s system 32% cpu 53.910 total
> root@karl:~
> # rm -rf /mnt/tmp/src/cvs.gnome.org                                                                           
> root@karl:~
> # time (cd /usr/local/src/ && tar pcf - cvs.gnome.org) | buffer -m 8m -p 75 | (cd /mnt/tmp/src/ && tar pxf -) 
> ( cd /usr/local/src/ && tar pcf - cvs.gnome.org )  0.72s user 3.82s system 11% cpu 39.447 total
> buffer -m 8m -p 75  0.15s user 2.39s system 6% cpu 39.544 total
> ( cd /mnt/tmp/src/ && tar pxf - )  0.59s user 12.07s system 32% cpu 39.539 total
> 
>  Wow!  Not bad, huh?
> 
> 
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/ide/host0/bus0/target0/lun0/part3
>                        27G   16G   11G  59% /
> /dev/ide/host0/bus0/target0/lun0/part1
>                        29M  6.3M   21M  23% /boot
> shm                   2.8G     0  2.8G   0% /var/shm
> /dev/md/0              55G  234M   55G   1% /mnt/tmp
> 
> 
>  YMMV, since:
> 
> # hdparm -t /dev/hda3 /dev/md/0
> 
> /dev/hda3:
>  Timing buffered disk reads:  64 MB in  2.50 seconds = 25.60 MB/sec
> 
> /dev/md/0:
>  Timing buffered disk reads:  64 MB in  1.10 seconds = 58.18 MB/sec
> 
>  ... the RAID0 (software raid 0 on UDMA 100 EIDE) destination is much
>  faster than the source filesystem.  That is why filling the buffer
>  before starting to write helped the timing so much.  In this case,
>  having more than one process at work is beneficial.
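>  if you want to reuse the winning pipeline, here is a hedged sketch
>  as a shell function... "fastcp" is an illustrative name, not a real
>  tool, and it falls back to plain cat (losing the buffering benefit)
>  when buffer(1) is not installed:

```shell
# fastcp SRC_DIR DEST_DIR
# Copies SRC_DIR into DEST_DIR via tar-to-tar pipe, inserting
# buffer(1) between the two tars when it is available.
fastcp() {
    src_parent=$(dirname "$1")
    src_name=$(basename "$1")
    dest="$2"
    if command -v buffer >/dev/null 2>&1; then
        # 8 MB pool, start draining once 75% of the blocks are
        # filled -- the settings that benchmarked fastest above
        (cd "$src_parent" && tar pcf - "$src_name") \
            | buffer -m 8m -p 75 \
            | (cd "$dest" && tar pxf -)
    else
        # no buffer installed: plain tar-to-tar pipe
        (cd "$src_parent" && tar pcf - "$src_name") \
            | (cd "$dest" && tar pxf -)
    fi
}
```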
> 
>  The situation with "find | cpio" versus "tar c | buffer | tar x"
>  seems analogous to what we do: if you just point out the bugs, it
>  takes longer for them to get fixed than if you submit a patch.  Can
>  you see what I mean by that?  In "find | cpio", "find" is just
>  walking the filesystem, handing file names off to "cpio", which must
>  then stat and read each file itself, and then also write it back out
>  to the new location.  In the "tar c | buffer | tar x" case though,
>  the "tar c" is making its own list of files, packing them up, and
>  piping the whole bundle off to the buffer (our BTS?), where it is
>  ready to be unpacked by the "tar x".  Hmmm.
> 
>  "cpio" doesn't know how to find, it just knows how to archive or copy
>  through...  Many of you don't know how to fix the code when you find
>  a bug, yet.  Nor do I.  Often enough it's way over my head.  Often
>  enough the BTS already contains a report about the bug I just found.
> 
>  :-) It's late and I'm rambling and I don't feel like editing this
>  story any longer.  Just thought I'd share my findings.  Hope it
>  helps someone.
> 
> -- 
> Karl M. Hegbloom
> mailto: karlheg@hegbloom.net
> 
> 
> -- 
> To UNSUBSCRIBE, email to debian-user-request@lists.debian.org 
> with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
> 
