Re: MDADM RAID1 of external USB 3.0 Drives
Linux-Fan <Ma_Sys.ma@web.de> writes:
> On 09/21/2014 08:41 PM, lee wrote:
>> Linux-Fan <Ma_Sys.ma@web.de> writes:
>>>> On 09/20/2014 04:55 PM, lee wrote:
>>>> Other than that, in my experience Seagate disks may have an unusually
>>>> high failure rate.
>>>
>>> Mine all work here. SMART reports
>>
>> They'll work until they fail. I don't believe in the smart-info.
>
> I do not trust SMART to be a reliable means of failure-prevention either
> (the only failure I ever had occurred without any SMART warning), but
> the "counters" especially for such normal things like power-on hours or
> power-cycle counts are reliable as far as I can tell. Also, the drive is
> used and filled with my data which all seems to be readable and correct.
I've seen the SMART info show implausible numbers for the hours and for
the temperature. Hence I can only guess which of the values are true
and which aren't, so I'm simply ignoring them. And I never bothered to
try to relate them to a disk failure. When a disk has failed, it has
failed, and what the SMART info says is irrelevant.
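That said, anyone who wants to look at the raw counters themselves can
dump them with smartmontools --- /dev/sda is only a placeholder here,
adjust to taste:

  $ sudo smartctl -A /dev/sda \
      | grep -i -e power_on -e power_cycle -e temperature

Whether you then believe the numbers is another matter.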
>> You rebuild the RAID within 45 seconds? And you realise that RAID has a
>> reputation to fail beyond recovery preferably during rebuilds?
>
> No, I did not rebuild because it is not necessary as the data has not
> changed and the RAID had not been assembled (degraded) yet.
>
> And the second statement was the very reason for me starting this thread.
Ah I see, that minimises the problem. I haven't followed all of the
thread.
>> You might be better off without this RAID and backups to the second disk
>> with rsync instead.
>
> Also a good idea, but I had a drive failure /once/ (the only SSD I ever
> bought) and although the system was backed up and restored, it still
> took five hours to restore it to a correctly working state.
By using two disks attached via unreliable connections in a RAID1, you
have more potential points of (unexpected) total failure in your setup
than you would have if you were using one disk while working and the
other one for backups.
For all you know, you can have a problem with the USB connections that
leads to data on both disks being damaged at the same time or close
enough in time as to make (some of) the data unrecoverable. This is
particularly critical for instances when the RAID needs to be rebuilt,
especially when rebuilding it while you're working with the data.
You are using two disks and two unreliable connections at the same time
because of the RAID. That increases your chances of a connection going
bad *and* of a disk failure: A disk which is just sitting there, like a
backup disk, is pretty unlikely to go bad while it sits. A non-existent
connection cannot go bad at all.
When you don't use RAID but a backup, you are very likely to only lose
the changes you have made since the last backup in case a disk fails.
When you use RAID, you may lose all the data.
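A backup to the second disk can be as simple as something along these
lines with rsync --- the paths are made up, assuming the second disk is
mounted at /mnt/backup:

  $ rsync -a --delete /home/ /mnt/backup/home/

The trailing slashes matter to rsync, and --delete makes the copy mirror
the source, so you still want more than one backup generation for data
you really care about.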
This odd RAID setup you have doesn't even save you time in case a disk
fails. Are you really sure that you would want to rebuild the RAID when
a disk has gone bad? I surely wouldn't do it; I would make a copy
before attempting a rebuild, and that takes time.
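By a copy I mean something like cloning the surviving disk with
ddrescue (from the gddrescue package) before touching the array --- the
device names here are placeholders:

  $ sudo ddrescue -f /dev/sdb /dev/sdc /root/sdb.map

Plain dd would do as well if the disk still reads cleanly; ddrescue just
keeps going when it doesn't and records the unreadable sectors in the
map file.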
You're merely throwing dice with this and increasing your chances of
losing data. All you achieve is disadvantages, with no advantages at
all.
> The failure itself was not the problem -- it was just that it was
> completely unexpected.
There are no unexpected disk failures. Disks do fail; the only question
is when.
> Now, I try to avoid this "unexpected" by using RAID.
That's really cool and I appreciate that. Now you only need to do it
right rather than using RAID to your disadvantage.
> Even if it is unstable, i.e. fails earlier than a better approach
> which was already suggested, I will have a drive fail and /be able to
> take action/ before (all of) the data is lost.
Your chances of taking action before all of the data is lost are better
when you don't use USB for RAID and have backups instead.
You can lose data without a drive failing by making a mistake.
Accidentally deleting your data is a possibility, and that goes so fast
that you have no chance to stop it before considerable damage is done.
You can also suffer from a power surge which fries all your disks.
You do need backups, and you can easily have one without any additional
cost, and even with increased overall reliability.
RAID is never a replacement for backups, and the protection it provides
against data loss is limited. In the first place, RAID is a tool to
avoid downtime.
With the amount of data we store nowadays, classic RAID is even more or
less obsolete because it doesn't provide sufficient protection against
data loss or corruption. File systems like ZFS seem to be much better
in that regard. You also should have ECC RAM.
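For example, ZFS checksums every block, and a scrub reports (and, given
redundancy, repairs) silent corruption --- "tank" is just an example
pool name:

  $ sudo zpool scrub tank
  $ sudo zpool status tank

An mdadm check, by contrast, can only tell you that the mirrors
disagree, not which copy is the good one.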
> The instability lies in a single point: Sometimes, upon system startup,
> the drive is not recognized. There has not been a single loss of
> connection while the system was running.
Not yet.
What if your cat gets (or you get) entangled in the USB cables, falls
off the table and thereby unplugs the disks?
I admit that I'm a little biased because I don't like USB. The only
USB device I'm using is my trackball, because I haven't got an adaptor
yet that would allow me to plug it into a PS/2 port. So I can't really
speak to the reliability of USB connections, other than that a card
reader has been flaky, and a USB 3.0 card I wanted to connect a USB
disk to was even crashy until I did something (I don't remember what)
that somehow fixed it. (The USB disk failed shortly after I bought it,
but since I had some data on it, I couldn't replace it under warranty,
so it was a total waste of money.)
When I enable power management for USB, my trackball falls asleep and I
have to disable the power management to wake it up again. How do your
USB disks handle that?
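On Linux you can at least switch off autosuspend per device via sysfs
--- whether a given enclosure copes with being suspended is another
question. The device path below is only an example:

  $ echo on | sudo tee /sys/bus/usb/devices/1-1/power/control

"on" disables autosuspend for that device, "auto" enables it.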
Like it or not, USB requires polling. What if your CPU is busy and
doesn't have time to poll the data from the disks, or has some hiccup
that leads to losing or corrupting data on the USB connection? Even USB
keyboards are way too laggy for me.
The only advantage of USB I can see is that it is hot-pluggable. It's a
rather minor advantage to me. (I still don't understand why they didn't
make PS/2 hot-pluggable as its predecessor was. That would be really
useful ...)
> The only problem with that instability was that it caused the RAID to
> need a rebuild as it came up degraded (because one drive was missing).
You wouldn't have this problem if you didn't use RAID with USB disks but
one "live" disk and the other one for backups ...
> And, as already mentioned, having to rebuild an array about once a week
> is a bad thing.
You wouldn't have this problem if you didn't use RAID with USB disks but
one "live" disk and the other one for backups ...
AFAIK that's not a rebuild, it's some sort of integrity check, usually
done once a week now. It used to be done once a month --- what does
that tell you?
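On Debian that check is kicked off by /usr/share/mdadm/checkarray from
a cron job shipped with the mdadm package. Whether an array is checking
or actually rebuilding can be seen like this, md0 being an example:

  $ cat /sys/block/md0/md/sync_action
  $ cat /proc/mdstat

sync_action reads "check" during the integrity check, "resync" or
"recover" during a real rebuild, and "idle" otherwise.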
> Making the boot fail if the drive has not been recognized solved this
> issue: I can reboot manually and the RAID continues to work properly,
> because it behaves as if the failed boot had never occurred: Both drives
> are "there" again and therefore MDADM accepts this as a normally
> functioning RAID without rebuild.
I'm glad that you found a way to work around the problem --- yet I still
recommend reconsidering the design of your storage system.
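If in doubt, whether mdadm really considers the array clean rather than
degraded can be verified with, again with md0 as an example:

  $ sudo mdadm --detail /dev/md0

The "State" line should say "clean" (or "active"), and "Failed Devices"
should be 0.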
> I do not /want/ to use RHEL (because otherwise, I would indeed run
> CentOS), I only wanted to express that if I did not have any time for
> system maintenance, I would pay for the support and be done with all
> that "OS-stuff".
ah, ok
> Instead, I now run a system without (commercial/granted) support and
> therefore explicitly accept some maintenance on my own, including the
> ability/necessity to spend some time on configuring an imperfect setup
> which includes USB disks.
And wouldn't this time be better spent on shutting down a server?
Don't get me wrong, I'm only asking because one of your arguments was
that you can't be bothered to shut down a server which you wouldn't want
to leave running all the time. If you like it better this way, that's
perfectly fine.
>>> On the other hand, I have learned my lesson and will not rely on USB
>>> disks for "permanently attached storage" again /in the future/.
>>
>> USB isn't even suited for temporarily attached storage.
>
> If I had to back up a medium amount of data, I would (still) save it to
> an external USB HDD -- why is this such a bad idea?
see above
It's also a bad idea because USB disks are awfully slow and because you
need to spend good money on a good USB enclosure so you can put in a
decent disk that gets sufficient cooling. For that money, or little
more, you can get a better solution. Only that solution is not
hot-pluggable, so you might as well have the backup disks internal just
like the others --- which, of course, also has disadvantages.
That's one more reason to have a server: I have the backup disks in my
desktop.
> Sure, most admins recommend tapes, but reading/writing tapes on a
> desktop requires equipment about as expensive as a new computer.
Yes, unfortunately, tape backups can be rather expensive.
Using more and more disks isn't a good solution, either: the more disks
you have in use, the more will fail. Add to that the impossibility of
detecting silent data corruption with RAID, the questionable status of
ZFS for Linux and the not-yet-readiness of btrfs, and part of my
"solution" is to attempt to reduce the amount of data I'm storing and to
reduce the number of disks I have in use.
Other than that, getting some trays for the backup disks to make them
pluggable would be a good idea.
> Also, my backup strategy always includes the simple question: "How
> would I access my data from any system?" "Any system" being thought of
> as the average Windows machine without any fancy devices to rely on.
That terribly limits you in your choice of file systems, unless you have
a CD/DVD/USB-stick with a bootable live system on it --- provided that
you can boot it because you have a non-UEFI machine around or a live
system that boots on UEFI ...
What FS do you use on your USB disks? And how do you read your software
RAID when you plug your disks into your "average Windows machine"?
--
Knowledge is volatile and fluid. Software is power.