
Re: MDADM RAID1 of external USB 3.0 Drives



On 09/22/2014 03:23 AM, lee wrote:
> Linux-Fan <Ma_Sys.ma@web.de> writes:
>> On 09/21/2014 08:41 PM, lee wrote:
>>> Linux-Fan <Ma_Sys.ma@web.de> writes:
>>>>> On 09/20/2014 04:55 PM, lee wrote:
>>>>> Other than that, in my experience Seagate disks may have an unusually
>>>>> high failure rate.
>>>>
>>>> Mine all work here. SMART reports
>>>
>>> They'll work until they fail.  I don't believe in the smart-info.
>>
>> I do not trust SMART to be a reliable means of failure-prevention either
>> (the only failure I ever had occurred without any SMART warning), but
>> the "counters" especially for such normal things like power-on hours or
>> power-cycle counts are reliable as far as I can tell. Also, the drive is
>> used and filled with my data which all seems to be readable and correct.
> 
> I've seen the smart info show incredible numbers for the hours and for
> the temperature.  Hence I can only guess which of the values are true
> and which aren't, so I'm simply ignoring them.  And I never bothered to
> try to relate them to a disk failure.  When a disk has failed, it has
> failed and what the smart info says is irrelevant.

I always at least try to read/interpret the SMART data. I consider it
valuable information, although it is sometimes difficult to interpret.
(Especially Seagate's Raw Read Error Rate and some other attributes).
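
For reference, the counters can be read with smartctl more or less like
this (/dev/sdX is only a placeholder; depending on the USB bridge,
adding "-d sat" may be needed to reach the drive at all):

    # print the vendor attribute table (power-on hours, power cycles, ...)
    smartctl -A /dev/sdX

    # run a short self-test and read the result a few minutes later
    smartctl -t short /dev/sdX
    smartctl -l selftest /dev/sdX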

>>> You might be better off without this RAID and backups to the second disk
>>> with rsync instead.
>>
>> Also a good idea, but I had a drive failure /once/ (the only SSD I ever
>> bought) and although the system was backed up and restored, it still
>> took five hours to restore it to a correctly working state.
> 
> By using two disks attached via unreliable connections in a RAID1, you
> have more potential points of (unexpected) total failure in your setup
> than you would have if you were using one disk while working and the
> other one for backups.
> 
> For all you know, you can have a problem with the USB connections that
> leads to data on both disks being damaged at the same time or close
> enough in time as to make (some of) the data unrecoverable.  This is
> particularly critical for instances when the RAID needs to be rebuilt,
> especially when rebuilding it while you're working with the data.
> 
> You are using two disks and two unreliable connections at the same time
> because of the RAID.  That increases your chances of a connection going
> bad *and* of a disk failure: A disk which is just sitting there, like a
> backup disk, is pretty unlikely to go bad while it sits.  A non-existent
> connection cannot go bad at all.
> 
> When you don't use RAID but a backup, you are very likely to only lose
> the changes you have made since the last backup in case a disk fails.
> When you use RAID, you may lose all the data.

I did not know USB was that unreliable -- I am probably going to
implement these suggestions soon.

> This odd RAID setup you have doesn't even save you time in case a disk
> fails.  Are you really sure that you would want to rebuild the RAID when
> a disk has gone bad?  I surely wouldn't do it; I would make a copy
> before attempting a rebuild, and that takes time.
> 
> You're merely throwing dice with this and are increasing your chances to
> lose data.  What you achieve is disadvantages for no advantages at all.

I am no longer sure about the stability of my solution. I will
certainly also try the "one connected, one for backup" variant, as it
would simplify the setup and make it more robust.
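
If I switch to that, the backup side would probably be little more than
an rsync invocation along these lines (the mount points are
placeholders, not my actual paths):

    # mirror the live disk onto the backup disk; -a keeps permissions,
    # ownership and timestamps, --delete drops files removed on the source
    # (leave it out to keep deleted files around on the backup)
    rsync -aH --delete /mnt/live/ /mnt/backup/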

>> The failure itself was not the problem -- it was just that it was
>> completely unexpected.
> 
> There are no unexpected disk failures.  Disks do fail, the only question
> is when.

I have not been using this technology long enough to come to this
conclusion. But in the end, all devices will fail at some point.

[...]

> With the amount of data we store nowadays, classic RAID is even more or
> less obsolete because it doesn't provide sufficient protection against
> data loss or corruption.  File systems like ZFS seem to be much better
> in that regard.  You also should have ECC RAM.

Data consistency is truly an issue. Still, I do not trust ZFS on Linux
or the experimental Btrfs more than MDADM plus scrubbing once per
month. Both of these "advanced" technologies add complexity which I
have tried to avoid so far. I did not expect USB to turn out to be such
a source of additional complexity itself.
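
By "scrubbing" I mean the regular md consistency check; on Debian this
is normally triggered from /etc/cron.d/mdadm via
/usr/share/mdadm/checkarray, if I remember correctly. Done by hand, it
amounts to roughly the following, with md0 as a placeholder for the
actual array:

    # start a consistency check of the array
    echo check > /sys/block/md0/md/sync_action

    # watch progress and look at the mismatch counter afterwards
    cat /proc/mdstat
    cat /sys/block/md0/md/mismatch_cnt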

ECC RAM is already there.
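
Whether the ECC is actually active can be double-checked roughly like
this (the exact output depends on the board and kernel; both commands
are only quick sanity checks):

    # the "Error Correction Type" field of the physical memory array
    dmidecode -t memory | grep -i "error correction"

    # if the EDAC driver is loaded, the memory controllers show up here
    ls /sys/devices/system/edac/mc/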

>> The instability lies in a single point: Sometimes, upon system startup,
>> the drive is not recognized. There has not been a single loss of
>> connection while the system was running.
> 
> not yet
> 
> What if your cat gets (or you get) entangled in the USB cables, falls
> off the table and thereby unplugs the disks?

Of course, I have taken action to prevent the disks from being
accidentally disconnected. Also, there are no pets here which could get
in the way.

> I admit that I'm a little biased because I don't like USB.  The only
> thing USB I'm using is my trackball because I haven't got an adaptor yet
> that would allow me to plug it into a PS/2 port.  So I can't really
> speak to the reliability of USB connections other than that with a card
> reader, it has been flaky and even crashy with an USB 3.0 card I wanted
> to connect an USB disk to until I did something I don't remember which
> somehow fixed it.  (The USB disk failed shortly after I bought it, but
> since I had some data on it, I couldn't replace it under warranty, so it
> was a total waste of money.)
> 
> When I enable power management for USB, my trackball falls asleep and I
> have to disable the power management to wake it up again.  How do your
> USB disks handle that?

In the currently running system, all USB devices work as reliably as I
expect: they never lose their connection and all respond with
reasonable latency (for me). As the external storage is not accessed
very often (I only use it for large files which would otherwise have to
be deleted, and for additional VMs), the disks sometimes make a quiet
"click" when they are accessed again.
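
The click suggests that the drives park or spin down on their own; so
far that has not caused trouble. If it ever does, I would first look at
the drive's own power management, roughly as below (whether such ATA
commands even make it through the USB bridge is another question;
/dev/sdX is a placeholder):

    # query and, if necessary, relax the Advanced Power Management level
    hdparm -B /dev/sdX
    hdparm -B 254 /dev/sdX    # 254 = least aggressive without turning APM off

    # disable the standby (spin-down) timer entirely
    hdparm -S 0 /dev/sdX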

> Like USB or not, USB requires polling.  What if your CPU is busy and
> doesn't have time to poll the data from the disks or has some hiccup
> that leads to losing or corrupting data on the USB connection?  Even USB
> keyboards are way too laggy for me.

I have not experienced any keyboard lag yet, and I did not know that
USB required polling. The quad core is rarely fully occupied, which is
probably why I have not run into such problems so far.

> The only advantage of USB I can see is that it is hot-pluggable.  It's a
> rather minor advantage to me.  (I still don't understand why they didn't
> make PS/2 hot-pluggable as its predecessor was.  That would be really
> useful ...)

In my case, the hot-plugging is also of minor relevance.

>> And, as already mentioned, having to rebuild an array about once a week
>> is a bad thing.
> 
> You wouldn't have this problem if you didn't use RAID with USB disks but
> one "life" disk and the other one for backups ...
> 
> AFAIK that's not a rebuild, it's some sort of integrity check, usually
> done once a week now.  It used to be done once a month --- what does
> that tell you?

As far as I can tell, this integrity check runs once a month. The "one
week" refers to the "one failed boot per week" caused by the drive not
being recognized.

>> Making the boot fail if the drive has not been recognized solved this
>> issue: I can reboot manually and the RAID continues to work properly,
>> because it behaves as if the failed boot had never occurred: Both drives
>> are "there" again and therefore MDADM accepts this as a normally
>> functioning RAID without rebuild.
> 
> I'm glad that you found a way to work around the problem --- yet I still
> recommend to reconsider the design of your storage system.

I will think about it again.
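
In the meantime, a write-intent bitmap should at least keep any
unavoidable resync short when a disk temporarily drops out; roughly,
with md0 and sdX1 as placeholders for the actual devices:

    # add an internal write-intent bitmap so that a briefly missing disk
    # only needs a partial resync instead of a full rebuild
    mdadm --grow --bitmap=internal /dev/md0

    # after a dropped disk has come back, re-add it to the array
    mdadm /dev/md0 --re-add /dev/sdX1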

>> I do not /want/ to use RHEL (because otherwise, I would indeed run
>> CentOS), I only wanted to express that if I did not have any time for
>> system maintenance, I would pay for the support and be done with all
>> that "OS-stuff".
> 
> ah, ok
> 
>> Instead, I now run a system without (commercial/granted) support and
>> therefore explicitly accept some maintenance of my own, including the
>> ability/necessity to spend some time on configuring an imperfect setup
>> which includes USB disks.
> 
> And this time wouldn't be spent better on shutting down a server?

Actually, looking at it now, with some hours already spent on
configuring this setup, it would have made more sense to use a server
from the very beginning. However, I did not think so when I bought the
disks: I thought it would not matter which way the drives were connected
and assumed I could get a more reliable system if it did /not/ involve
another machine (I thought a server would only be necessary if I wanted
other systems to access the data as well). It turned out I was wrong
about that.

[...]

>>>> On the other hand, I have learned my lesson and will not rely on USB
>>>> disks for "permanently attached storage" again /in the future/.
>>>
>>> USB isn't even suited for temporarily attached storage.
>>
>> If I had to backup a medium amount of data, I would (still) save it to
>> an external USB HDD -- why is this such a bad idea?
> 
> see above
> 
> It's also a bad idea because USB disks are awfully slow and because you

Compared to what is already in the system (two 160 GB disks and two 500
GB disks), the "new" USB disks are actually slightly faster. Good 10k rpm
or 15k rpm disks will of course be much faster, but I do not have any.
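
For a rough sequential-read comparison, hdparm's timing test is enough
(sdX being whichever disk is to be measured; the numbers vary, so it is
worth running a few times):

    # buffered sequential read benchmark
    hdparm -t /dev/sdX

    # the same, bypassing the page cache
    hdparm -t --direct /dev/sdX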

[...]

>> Also, my backup strategy always includes the simple question: "How
>> would I access my data from any system?" "Any system" being thought of
>> as the average Windows machine without any fancy devices to rely on.
> 
> That terribly limits you in your choice of file systems, unless you have
> a CD/DVD/USB-stick with a bootable live system on it --- provided that
> you can boot it because you have a non-UEFI machine around or a live
> system that boots on UEFI ...
> 
> What FS do you use on your USB disks?  And how do you read your software
> RAID when you plug your disks into your "average Windows machine"?

Usually, I keep a live USB stick and three live disks (one 32-bit CD,
one 32-bit DVD and one 64-bit CD) around to be able to run Linux under
most circumstances. Otherwise, I back up important data (which is
luckily not that much) to 16 GB CF cards formatted with FAT32 (I have
taken measures to keep UNIX file permissions and special files like
FIFOs etc. intact). In the worst-case scenario, the RAID cannot be read
on Windows, but being able to run a live system, I could still access
all the data (which I could not do with tapes, because of the missing
device to read them).
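
One simple way to keep permissions and special files intact on FAT32 is
to store an archive instead of plain files; a sketch with placeholder
paths, not necessarily my actual setup, and keeping FAT32's 4 GB file
size limit in mind:

    # pack the data into a tar archive; tar records owners, permissions,
    # symlinks and FIFOs itself, so the FAT32 underneath does not matter
    tar -czf /mnt/cfcard/backup.tar.gz -C /home/user important-data

    # restore later; -p applies the stored permissions
    tar -xpzf /mnt/cfcard/backup.tar.gz -C /restore/target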

Linux-Fan

-- 
http://masysma.lima-city.de/
