Re: LVM write performance
On 08/09/2011 06:30 AM, Stan Hoeppner wrote:
> On 8/8/2011 11:03 PM, Stan Hoeppner wrote:
>> On 8/8/2011 2:00 PM, Dion Kant wrote:
>>> On 08/08/2011 03:33 PM, Stan Hoeppner wrote:
>>>> On 8/8/2011 1:25 AM, Dion Kant wrote:
>>>>> Dear list,
>>>>>
>>>>> When writing to a logical volume (/dev/sys/test) directly through the
>>>>> device, I get poor performance:
>>>>>
>>>>> root@dom0-2:/dev/mapper# dd of=/dev/sys/test if=/dev/zero
>>>>> 4580305+0 records in
>>>>> 4580305+0 records out
>>>>> 2345116160 bytes (2.3 GB) copied, 119.327 s, 19.7 MB/s
>>>>>
>>>>> Making a file system on top of the LV, mounting it, and writing into a
>>>>> file is fine:
>>>>>
>>>>> root@dom0-2:/dev/mapper# mkfs.xfs /dev/sys/test
>>>>> root@dom0-2:/mnt# mount /dev/sys/test /mnt/lv
>>>>> root@dom0-2:/mnt# dd of=/mnt/lv/out if=/dev/zero
>>>>> 2647510+0 records in
>>>>> 2647510+0 records out
>>>>> 1355525120 bytes (1.4 GB) copied, 11.3235 s, 120 MB/s
>>>>>
>>>>> Furthermore, by accident I noticed that writing directly to the block
>>>>> device is fine when the LV is mounted (of course destroying the file
>>>>> system on it):
>>>>>
>>>>> root@dom0-2:/mnt# dd of=/dev/sys/test if=/dev/zero
>>>>> 3703375+0 records in
>>>>> 3703374+0 records out
>>>>> 1896127488 bytes (1.9 GB) copied, 15.4927 s, 122 MB/s
>>>>>
>>>>> Does anyone know what is going on?
>>>>>
>>>>> The configuration is as follows:
>>>> Yes. You lack knowledge of the Linux storage stack and of the dd
>>>> utility. Your system is fine. You are simply running an improper test,
>>>> and interpreting the results from that test incorrectly.
>>>>
>>>> Google for more information on the "slow" results you are seeing.
>>>>
>>> Hmm, interpreting your answer, this behaviour is what you expect.
>>> However, I think it is a bit strange to find, with this "improper
>>> test", about a factor of 10 difference between reading from and writing
>>> to a logical volume by using dd directly on the device file. Note that dd
>>> if=/dev/sys/test of=/dev/null does give disk I/O limited results.
>> Apparently you are Google challenged as well. Here:
>> http://lmgtfy.com/?q=lvm+block+size
>>
>> 5th hit:
>> http://blog.famzah.net/2010/02/05/dd-sequential-write-performance-tests-on-a-raw-block-device-may-be-incorrect/
>>
>>> What is the proper way to copy a (large) raw disk image onto a logical
>>> volume?
>> See above, and do additional research into dd and "block size". It also
>> wouldn't hurt for you to actually read and understand the dd man page.
>>
>>> Thanks for your advice to try Google. I already found a couple of posts
>>> from people describing a similar issue, but no proper explanation yet.
>> I already knew the answer, so maybe my search criteria is what allowed
>> me to "find" the answer for you in 20 seconds or less. I hate spoon
>> feeding people, as spoon feeding is antithetical to learning and
>> remembering. Hopefully you'll learn something from this thread, and
>> remember it. :)
> BTW, you didn't mention what disk drive is in use in this test. Is it
> an Advanced Format drive? If so, and your partitions are unaligned,
> this in combination with no dd block size being specified will cause
> your 10x drop in your dd "test". The wrong block size alone shouldn't
> yield a 10x drop, more like 3-4x. Please state the model# of the disk
> drive, and the partition table using:
>
> /# hdparm -I /dev/sdX
> /# fdisk -l /dev/sdX
>
> Lemme guess, this is one of those POS cheap WD Green drives, isn't it?
> Just in case, read this too:
>
> http://wdc.custhelp.com/app/answers/detail/a_id/5655/~/how-to-install-a-wd-advanced-format-drive-on-a-non-windows-operating-system
>
> This document applies to *all* Advanced Format drives, not strictly
> those sold by Western Digital.
>
Hello Stan,

Thanks for your remarks. The disk info is given below. Writing to the
disk is fine when it is mounted, so I think it is not a hardware/alignment
issue. However, your remarks made me do some additional investigation:

1. dd of=/dev/sdb4 if=/dev/zero gives similar results, so it has nothing
   to do with LVM;
2. My statement about writing like this on an openSUSE kernel is wrong.
   With openSUSE and the same hardware I also get similar (slow) results
   when writing to the disk using dd via the device file.

So the issue has now shifted to the asymmetric behaviour when
writing/reading using dd directly through the (block) device file:

Reading with dd if=/dev/sdb4 of=/dev/null gives disk-limited performance.
Writing with dd of=/dev/sdb4 if=/dev/zero gives about a factor of 10 less
performance.
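
For reference, the record counts in the first test quoted above work out to
2345116160 / 4580305 = 512 bytes per record, which is dd's default block
size when bs= is not given. Below is a minimal sketch for checking how much
the block size matters. It assumes a dd that accepts conv=fsync (GNU
coreutils does). TARGET and TOTAL are names made up for this sketch, and the
target defaults to a scratch file so that nothing is overwritten by accident:

```shell
#!/bin/sh
# Hypothetical scratch target; a regular file keeps the sketch harmless.
# Point TARGET at a block device only if its contents may be destroyed.
TARGET=${TARGET:-$(mktemp)}
TOTAL=$((16 * 1024 * 1024))   # 16 MiB per run, an arbitrary test size

for bs in 512 4096 1048576; do
    count=$((TOTAL / bs))
    printf 'bs=%s count=%s: ' "$bs" "$count"
    # conv=fsync makes dd flush to the target before it reports, so data
    # still sitting in the page cache does not inflate the figure.
    # conv=notrunc only matters when TARGET is a regular file.
    dd if=/dev/zero of="$TARGET" bs="$bs" count="$count" \
       conv=fsync,notrunc 2>&1 | tail -n 1
done
```

Running it with TARGET=/dev/sdb4 would repeat the test above with explicit
block sizes, at the cost of destroying whatever is on the partition. Whether
a larger bs closes the gap on this system is for the run to show; the sketch
is not a measured result.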

However, after mounting a file system on sdb4 (read-only), I can use dd
of=/dev/sdb4 if=/dev/zero at (near) disk-limited performance.

I have now used this trick to copy a large (raw) disk image onto an LVM
partition. I think this is odd. Can somebody explain why it behaves this way?
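
As an alternative to the mount trick, a copy with an explicit block size
might look like the sketch below. The helper name copy_image and the 1 MiB
default are assumptions made for illustration, not something taken from the
dd documentation or from this thread. The demonstration runs on scratch
files standing in for the image and the volume:

```shell
#!/bin/sh
# copy_image SRC DST [BS]: hypothetical helper, not part of any tool.
copy_image() {
    src=$1; dst=$2; bs=${3:-1048576}   # default 1 MiB, given in bytes
    # bs sets both the read and the write size. conv=fsync flushes the
    # output before dd exits. conv=notrunc leaves a regular-file
    # destination at its existing length (irrelevant for a block device).
    dd if="$src" of="$dst" bs="$bs" conv=fsync,notrunc 2>/dev/null
}

# Scratch files standing in for the raw image and the logical volume.
img=$(mktemp)
vol=$(mktemp)
dd if=/dev/urandom of="$img" bs=1024 count=2048 2>/dev/null   # 2 MiB image

# Copy, then compare byte for byte.
copy_image "$img" "$vol" && cmp "$img" "$vol" && echo "copy verified"
```

On the real system the call would be something like
copy_image disk.img /dev/sys/test, where disk.img is a placeholder name for
the image file.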

Here is the disk info:

Model Family:     Seagate Barracuda ES
Device Model:     ST3750640NS

root@dom0-2:~# fdisk -l /dev/sdb

Disk /dev/sdb: 750.2 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000eae95

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         244     1951744   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2             244         280      292864   fd  Linux raid autodetect
Partition 2 does not end on cylinder boundary.
/dev/sdb3             280        7575    58593280   fd  Linux raid autodetect
Partition 3 does not end on cylinder boundary.
/dev/sdb4            7575       91202   671734784   fd  Linux raid autodetect
Partition 4 does not end on cylinder boundary.

root@dom0-2:~# hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
        Model Number:       ST3750640NS
        Serial Number:      5QD193MQ
        Firmware Revision:  3.AEK
Standards:
        Supported: 7 6 5 4
        Likely used: 8
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors: 1465149168
        Logical  Sector size:                   512 bytes
        Physical Sector size:                   512 bytes
        device size with M = 1024*1024:      715404 MBytes
        device size with M = 1000*1000:      750156 MBytes (750 GB)
        cache/buffer size  = 16384 KBytes
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = ?
        Advanced power management level: 254
        Recommended acoustic management value: 254, current value: 0
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=240ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    DOWNLOAD_MICROCODE
           *    Advanced Power Management feature set
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
                64-bit World wide name
                Time Limited Commands (TLC) feature set
                Command Completion Time Limit (CCTL)
           *    Gen1 signaling speed (1.5Gb/s)
           *    Native Command Queueing (NCQ)
           *    Phy event counters
                Device-initiated interface power management
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT LBA Segment Access (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
        not     supported: enhanced erase
Logical Unit WWN Device Identifier: 0000000000000000
        NAA             : 0
        IEEE OUI        : 000000
        Unique ID       : 000000000
Checksum: correct

Dion