
Re: persistent storage hardware: recommendations, comments, and opinions please




dman <dsh8290@rit.edu> writes:

> Disks :
>
> Western Digital WD400BBRTL
>     40GB, 7200 rpm, Ultra ATA 100, 8.9 ms seek time, 2MB buffer, $130

I'd go with this one given the set you list.

> Western Digital WD180ABRTL-120
>     18GB, 5400 rpm, Ultra ATA 100, 12.0 ms seek time, 2MB buffer, $80
>
> Samsung SV4002H  (looks like a used disk)
>     40GB, 5400 rpm, ATA 100, $110
>
> The one shop also had 2 Maxtor disks, but I'm not sure I want another
> one of them.

For comparison, my config:

(2 Promise ATA100 controllers, one Promise-branded, one Maxtor-branded)
PDC20267: IDE controller on PCI bus 00 dev 68
PCI: Found IRQ 10 for device 00:0d.0
PDC20267: chipset revision 2
PDC20267: not 100% native mode: will probe irqs later
PDC20267: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
    ide2: BM-DMA at 0xac00-0xac07, BIOS settings: hde:DMA, hdf:DMA
    ide3: BM-DMA at 0xac08-0xac0f, BIOS settings: hdg:DMA, hdh:DMA
PDC20267: IDE controller on PCI bus 00 dev 78
PCI: Found IRQ 11 for device 00:0f.0
PDC20267: chipset revision 2
PDC20267: not 100% native mode: will probe irqs later
PDC20267: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
    ide4: BM-DMA at 0xc000-0xc007, BIOS settings: hdi:DMA, hdj:DMA
    ide5: BM-DMA at 0xc008-0xc00f, BIOS settings: hdk:DMA, hdl:DMA

(4 Maxtor 40G drives, all primaries)
hde: Maxtor 54610H6, ATA DISK drive
hdg: Maxtor 54610H6, ATA DISK drive
hdi: Maxtor 54610H6, ATA DISK drive
hdk: Maxtor 54610H6, ATA DISK drive
hde: 90045648 sectors (46103 MB) w/2048KiB Cache, CHS=89331/16/63, UDMA(100)

(Raw single disk performance)
# hdparm -tT /dev/hde
/dev/hde:
 Timing buffer-cache reads:   128 MB in  1.48 seconds = 86.49 MB/sec
 Timing buffered disk reads:  64 MB in  2.13 seconds = 30.05 MB/sec

The drives are set up as two pairs with RAID1 mirroring:

(sorry for the devfs device names -- oh well)
# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md1 : active raid1 ide/host2/bus1/target0/lun0/part1[1] ide/host2/bus0/target0/lun0/part1[0]
      48064 blocks [2/2] [UU]
      
md2 : active raid1 ide/host2/bus1/target0/lun0/part2[1] ide/host2/bus0/target0/lun0/part2[0]
      979840 blocks [2/2] [UU]
      
md3 : active raid1 ide/host2/bus1/target0/lun0/part3[1] ide/host2/bus0/target0/lun0/part3[0]
      43993920 blocks [2/2] [UU]
      
md4 : active raid1 ide/host4/bus1/target0/lun0/part2[0] ide/host4/bus0/target0/lun0/part2[1]
      979840 blocks [2/2] [UU]
      
md5 : active raid1 ide/host4/bus1/target0/lun0/part3[0] ide/host4/bus0/target0/lun0/part3[1]
      43993920 blocks [2/2] [UU]
      
md1 is /boot (dedicated boot partition with ext2)
md2 is swap
md3 is a physical volume for LVM
md4 is swap
md5 is a physical volume for LVM
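
(for reference, a sketch of how one of these mirror pairs could be
created -- mdadm syntax shown, raidtools with an /etc/raidtab gets you
the same thing; the device names match the primaries above)
# mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/hde3 /dev/hdg3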

I then carve out logical volumes from the LVM volume group.  One of the
options when creating a volume is striping, so that blocks are evenly
distributed between the two physical volumes.  Think of it as RAID0
over RAID1 (a stripe of mirrors, i.e. RAID10).
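
(a sketch of carving out a striped LV -- the name and numbers here
match the mp32lv volume shown further down: 10G, 2 stripes, 16k stripe
size)
# lvcreate -n mp32lv -L 10G -i 2 -I 16 shaktivg
# mkfs.xfs /dev/shaktivg/mp32lv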

(performance on an MD is still great, though it doesn't look like I get
parallel reads)
# hdparm -tT /dev/md3
/dev/md3:
 Timing buffer-cache reads:   128 MB in  1.07 seconds =119.63 MB/sec
 Timing buffered disk reads:  64 MB in  2.09 seconds = 30.62 MB/sec

(LVM high-level info)
# vgdisplay
--- Volume group ---
VG Name               shaktivg
VG Access             read/write
VG Status             available/resizable
VG #                  0
MAX LV                256
Cur LV                13
Open LV               12
MAX LV Size           255.99 GB
Max PV                256
Cur PV                2
Act PV                2
VG Size               83.91 GB
PE Size               4.00 MB
Total PE              21480
Alloc PE / Size       16553 / 64.66 GB
Free  PE / Size       4927 / 19.25 GB
VG UUID               3HIYl1-85Vn-hDTy-yBW8-k7Hk-6ncq-NHdihK

[lots of LVs]

--- Logical volume ---
LV Name                /dev/shaktivg/mp32lv
VG Name                shaktivg
LV Write Access        read/write
LV Status              available
LV #                   11
# open                 1
LV Size                10.00 GB
Current LE             2560
Allocated LE           2560
Stripes                2
Stripe size (KByte)    16
Allocation             next free
Read ahead sectors     120
Block device           58:8


--- Physical volumes ---
PV Name (#)           /dev/md/3 (1)
PV Status             available / allocatable
Total PE / Free PE    10740 / 2382

PV Name (#)           /dev/md/5 (2)
PV Status             available / allocatable
Total PE / Free PE    10740 / 2545

I use XFS for most of my filesystems:
# grep mp3 /proc/mounts
/dev/shaktivg/mp32lv /nobackup/mp3 xfs rw 0 0
# df -h | grep mp3
/dev/shaktivg/mp32lv   10G  7.5G  2.5G  75% /nobackup/mp3

# hdparm -tT /dev/shaktivg/mp32lv

/dev/shaktivg/mp32lv:
 Timing buffer-cache reads:   128 MB in  1.64 seconds = 78.05 MB/sec
 Timing buffered disk reads:  64 MB in  2.36 seconds = 27.12 MB/sec

XFS and LVM are a great combination.  You can grow your filesystem
online without having to worry about your partitions.  XFS can't
shrink, but ext2/3 can be shrunk offline (resize2fs).
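
(a sketch of an online grow; the +2G is just an example amount)
# lvextend -L +2G /dev/shaktivg/mp32lv
# xfs_growfs /nobackup/mp3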

In retrospect, I probably would *not* put / (root) in the LVM.  It
is nice to be able to grow your root and not have to allocate a fixed
slice for it.  But the initrd issues can be quite complex -- I haven't
gotten a bootable kernel since:

# uname -a
Linux shakti 2.4.9-xfs-lvm #1 Sun Nov 25 12:51:22 PST 2001 i686 unknown

and while it may not have anything to do with LVM's initrd, I haven't
had a chance to fix it since this is a "production" machine.

> Can someone provide a comparison of Ultra ATA 100 and SCSI?  Which is
> faster and/or more reliable?  

I used to be a big fan of SCSI.  I can now get "good enough"
performance out of IDE for a heck of a lot less money.  At my "real
job" we have a 25+TB disk farm.  That is all Fibre Channel with a bit
of SCSI here and there.  IDE would have a very hard time coming close.
On anything under 500G, IDE is "generally" the right answer IMO.

> Controllers :
>
> Adaptec SCSI Card 2906
>     7 devices, non-bootable, 10 MBps transfer, $63
>     SCSI 1, SCSI 2, Fast SCSI 2

I believe this one is based on a Future Domain chip -- if so, it's a
piece of crap.

> Adaptec ATA RAID 1200A
>     2 ATA/100 channels, RAID 0 1 0/1 JBOD, bootable, $100
>
> CompUSA (Silicon Image Sil0649CL160)
>     Ultra ATA 100, 2 channel, ACPI, $30

Not familiar with these.

You might want to look into the SiS ATA controllers.  Andre Hedrick
had some good things to say about 'em.  I've been happy with my ATA-100
Promise controllers.  The newer TX2 and ATA-133 Promise controllers
still seem to have teething problems on Linux (or at least it looks
that way on the linux-kernel list).  One of my controllers is a
Maxtor-branded card that is just a relabeled Promise -- might be worth
looking into.

If you really want hardware RAID, look into 3ware.  I've read good
things about all of their controllers in RAID1 mode, and the higher-end
ones are good for RAID5 (the lower-end ones either suck performance-wise
or lose data).

You can also look into the SCSI->IDE external RAID enclosures --
though the price point for the enclosures themselves is quite high,
and you need a good SCSI controller to get peak performance.  They
take IDE drives, though, so the cost to fully populate them is quite
reasonable.

>
> Does RAID restrict the combination of disks I can have?  IIRC RAID 0
> is no redundancy, and RAID 1 is simply maintaining two copies on
> separate disks.  If so, then wouldn't both disks need to be the same
> size?

Ideally you want disks with similar performance characteristics.

RAID0 is just striping.  Better performance, but you can lose your
data at any time (any single disk failure takes out the whole array).

RAID1 is mirroring.  The smaller disk limits how much you can mirror.
With hardware RAID, the smaller disk's capacity is all you get.  With
software RAID you can mirror up to the size of the smaller disk and
then use the remainder of the big disk for "non-critical" data.  Beware
of starving that drive for I/O...
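
(a sketch with made-up device names: hda is the bigger disk, hdc the
smaller; partition hda so that hda1 matches hdc1 in size, mirror those,
and use the leftover hda2 on its own)
# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda1 /dev/hdc1
# mkfs.ext2 /dev/hda2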

RAID5 needs at least 3 drives.  It is generally considered better for
read performance than for write performance -- though with good
caching it isn't always a loss.  Read performance on a good RAID5
can be incredible (since you can stream data from all your disks at
once).
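
(again a sketch with made-up device names -- three disks in a software
RAID5)
# mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/hda1 /dev/hdc1 /dev/hde1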

RAID10 is RAID0 striping over RAID1 mirrored pairs.  This gives you
good read/write performance (from the striping) with redundancy (from
the mirroring).  But you pay for it in disk...

-- 
-rupa



