Re: LVM write performance
On 8/13/2011 9:45 AM, Ivan Shmakov wrote:
>>>>>> Stan Hoeppner <email@example.com> writes:
> > The horrible performance with bs=512 is likely due to the LVM block
> > size being 4096, and forcing block writes that are 1/8th normal size,
> > causing lots of merging. If you divide 120MB/s by 8 you get 15MB/s,
> > which, IIRC, is in the ballpark of the 19MB/s write performance you
> > reported in your original post.
> I'm not an expert in that matter either, but I don't seem to
> recall that LVM uses any “blocks”, other than, of course, the
> LVM “extents.”
> What's more important in my opinion is that 4096 is exactly the
> platform's page size.
> --cut: vgcreate(8) --
> -s, --physicalextentsize PhysicalExtentSize[kKmMgGtT]
> Sets the physical extent size on physical volumes of this volume
> group. A size suffix (k for kilobytes up to t for terabytes) is
> optional, megabytes is the default if no suffix is present. The
> default is 4 MB and it must be at least 1 KB and a power of 2.
> --cut: vgcreate(8) --
To use a water analogy, an extent is a pool used for storing data. It
has zero to do with transferring the payload. A block is a bucket used
to carry data to and from the pool.
If one fills the bucket only 1/8th full, it takes 8 times as many
trips (transfers) to fill the pool as carrying a full bucket each time.
That inefficiency is a factor in the OP's problem. It's a very coarse
analogy, and maybe not the best, but it gets the overall point across.
The LVM block (bucket) size is 4kB, which does indeed match the page
size, and that matters. It also matches the default block size of the
major Linux filesystems. This is no coincidence: everything in Linux,
memory management and IO alike, is optimized around a 4kB page size.
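You can check both values on your own box; the commands below use GNU
getconf and coreutils stat, and the 4096 results shown are what you'd
expect on a typical x86/x86_64 system (other platforms may differ):

```shell
# Kernel page size in bytes (4096 on x86/x86_64)
getconf PAGESIZE

# Fundamental block size of the filesystem holding the current
# directory (%S is coreutils stat's fundamental block size directive)
stat -f -c %S .
```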
And to drive the point home that this isn't an LVM or RAID problem, but
a proper use of dd problem, here's a demonstration of the phenomenon on
a single low end internal 7.2k SATA disk w/16MB cache, with a partition
formatted with XFS, write barriers enabled:
$ dd if=/dev/zero of=./test1 bs=512 count=1000000
512000000 bytes (512 MB) copied, 16.2892 s, 31.4 MB/s
$ dd if=/dev/zero of=./test1 bs=1024 count=500000
512000000 bytes (512 MB) copied, 10.5173 s, 48.7 MB/s
$ dd if=/dev/zero of=./test1 bs=2048 count=250000
512000000 bytes (512 MB) copied, 7.77854 s, 65.8 MB/s
$ dd if=/dev/zero of=./test1 bs=4096 count=125000
512000000 bytes (512 MB) copied, 6.64778 s, 77.0 MB/s
$ dd if=/dev/zero of=./test1 bs=8192 count=62500
512000000 bytes (512 MB) copied, 6.10967 s, 83.8 MB/s
$ dd if=/dev/zero of=./test1 bs=16384 count=31250
512000000 bytes (512 MB) copied, 6.11042 s, 83.8 MB/s
This test system is rather old, having only 384MB RAM. I tested with
and without conv=fsync and the results are the same. This clearly
demonstrates that one should use a block size of at least 4kB with dd
when testing HDDs and SSDs, whether bare, on LVM, mdraid, or hardware
RAID. Floppy drives, tape, and other slower devices probably call for
a different dd block size.
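For anyone following along, a sane throughput test looks something
like the line below; I've shrunk the count to ~10MB just for
illustration, and the file name is only an example. conv=fsync makes
dd flush before it reports its rate, so the figure reflects the disk
rather than the page cache:

```shell
# ~10MB written in 4kB blocks to a scratch file; conv=fsync forces a
# flush before dd prints its throughput figure
dd if=/dev/zero of=./test1 bs=4096 count=2560 conv=fsync
```

Scale count back up (e.g. count=125000 for 512MB) for a real test;
a run much larger than RAM keeps caching effects out of the number.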