on Sat, May 05, 2001 at 01:10:01PM -0700, Karsten M. Self (kmself@ix.netcom.com) wrote:
> on Sat, May 05, 2001 at 02:14:20PM -0500, Nathan E Norman (nnorman@micromuse.com) wrote:
> > I was always taught to use a block size of 512 when {read,writ}ing to
> > floppies, and double the kByte size of the floppy to obtain the count
> > e.g.
> >
> >     dd if=/dev/fd0 of=floppy.img bs=512 count=2880
> >
> > I'll poke around to see if I can discover _why_ I've been doing it
> > this way for the last 5 years ... I believe the physical block size of
> > a floppy is 512 however.
>
> The only place I've found blocksize _really_ matters is in writing to
> tape. I've had endless problems in my former existence as a data
> analyst, usually working with data sourced from mainframes, in which
> blocking factors were critical. Also more recently when I'd managed to
> change the blocking factor while producing some backups to SCSI tape
> (DDS), but hadn't changed them for the following read/restore attempt.
>
> I don't quite understand the whole thing myself. For dd, AFAICT,
> blocking really only matters to the extent that you want to scale your
> 'count=' value to the blocksize, and the speed and buffering resulting.
>
> 512 was the traditional blocksize on a wide range of systems: DOS, VMS,
> and many old Unix systems (OpenBSD still reports 'df' output in 512K
> blocks).
>
> It probably doesn't matter that you're reading in 512 byte increments so
> long as whatever you're doing is in increments of 512 bytes, hence 1024,
> 2048, 4096, 8192, etc. The block factoring of a floppy is handled by
> the filesystem itself -- when you're imaging the disk you're bypassing
> this entirely. Sweet spot is probably determined by kernel disk
> buffering, head speed, and memory.
>
> Any hardware boffins here?

Well, I ran a series of tests reading from a minix-formatted 1.44MB
floppy using different blocksizes, flushing buffers between tests by
reading and writing 140MB of data.

Typical read was:

    time dd if=/dev/fd0 of=/tmp/fd0 bs=1024 count=1440

Flush was accomplished with:

    time dd if=/dev/zero of=/tmp/fdflush bs=1024 count=144000

Results are, um, suspiciously consistent. Times are in minutes:seconds.

    Blocksize   Time
    ---------   --------
          512   0:48.475
         1024   0:48.517
         2048   0:48.487
         4096   0:48.482
         8192   0:48.674
        16384   0:48.472
        32768   0:48.483
        65536   0:48.475
       131072   0:48.493

Even switching to a small and non-uniform blocksize (10) doesn't change
the results much -- a read time of 48.517 seconds.

The odd man out is the 8192 blocksize timing, more than two standard
deviations from the mean (48.50644 seconds). Assuming that this is the
result of a single-trial deviation, I reran the trial and got 48.476
seconds.

A plot of time by ln(blocksize) shows no clear correlation. [1]

Correlation of time to log(blocksize) is -0.23. Negative and low.

As the 1024 blocksize value is now an outlier, I reran it and got
48.472, for a correlation coefficient of 0.339. A boxplot shows all
values within two standard deviations at this point.

I'm going to say that blocksize, in copying to/from floppy, probably
doesn't matter from a performance perspective.

Stats via r-base package.

--------------------
Notes:

1. I'm using ln(blocksize) to linearize this factor. It makes
   interpretation of results somewhat simpler.

-- 
Karsten M. Self <kmself@ix.netcom.com>    http://kmself.home.netcom.com/
 What part of "Gestalt" don't you understand?    There is no K5 cabal
  http://gestalt-system.sourceforge.net/    http://www.kuro5hin.org
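
The summary statistics quoted in the message can be rechecked from the
table with a few lines of code. What follows is only a sketch, in Python
rather than the r-base session the post used. The variable and function
names are illustrative, and the choice of the sample standard deviation
is an assumption, since the post does not say which estimator was used.

```python
import math
import statistics

# Read times in seconds, copied from the table in the post.
first_pass = {
    512: 48.475, 1024: 48.517, 2048: 48.487, 4096: 48.482, 8192: 48.674,
    16384: 48.472, 32768: 48.483, 65536: 48.475, 131072: 48.493,
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


times = list(first_pass.values())
mean_time = statistics.mean(times)
sd_time = statistics.stdev(times)  # sample standard deviation (assumed)

# How far the 8192 timing sits from the mean, in standard deviations.
z_8192 = (first_pass[8192] - mean_time) / sd_time

# Substitute the rerun 8192 result, then correlate time with ln(blocksize).
rerun = dict(first_pass)
rerun[8192] = 48.476
log_sizes = [math.log(bs) for bs in rerun]
r_after_rerun = pearson(log_sizes, list(rerun.values()))

print(f"mean = {mean_time:.5f} s")
print(f"8192 deviation = {z_8192:.2f} standard deviations")
print(f"r(time, ln blocksize) = {r_after_rerun:.3f}")
```

Under these assumptions the script should give the 48.50644 second mean,
place the 8192 timing more than two standard deviations out, and give a
weak negative correlation close to the -0.23 reported after the first
rerun.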