
Re: recommendations for supported, affordable hardware raid controller.



On 2021-01-03 01:25, Andrei POPESCU wrote:
> On Sat, 02 Jan 21, 13:35:06, David Christensen wrote:

> > AIUI a journaling filesystem provides a two-step process to achieve
> > atomic writes of multiple sectors to disk -- e.g. a process wants to
> > put some data into a block here (say, a file), a block there (say, a
> > directory), etc., and consistency of the on-disk data structures must
> > be preserved.  The journal enables this: everything is first written
> > to the journal, then everything is written to its final location on
> > disk.

> That would mean all data is written to the disk twice and would make a
> journaling file system twice as slow compared to a non-journaling file
> system; the journal is typically on the same storage.
>
> Even if you put the journal on another storage, having the data written
> there in parallel would basically result in a sort of RAID ;)
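
Agreed on the cost.  To check my own understanding, here is a minimal
sketch of the two-step write I was describing (Python; the dict-as-disk
model and every name in it are mine, not any real file system's):

disk = {}      # block number -> bytes: the main on-disk area
journal = []   # append-only journal region (on the same storage)

def journaled_write(updates):
    """Atomically apply `updates` (block number -> new contents)."""
    # Step 1: write every block, plus a commit record, to the journal.
    for block, data in updates.items():
        journal.append(("data", block, data))
    journal.append(("commit",))
    # Step 2: write the very same blocks again to their final locations.
    for block, data in updates.items():
        disk[block] = data
    # Only after step 2 may the journal space be reclaimed.

journaled_write({7: b"file contents", 42: b"directory entry"})

Every byte does indeed go out twice, which is why ext4, for example,
journals only metadata by default.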

> > If either step is interrupted, the filesystem driver will detect the
> > failure and respond.  When done, either all of the blocks have been
> > updated on disk or none of the blocks on disk have been changed.

> My understanding of [1] is that the journal only keeps track of
> metadata and/or data *updates*.
>
> In case of a crash it can only tell you whether the write was
> "completed". It has no way to know whether the data on disk is actually
> correct.
>
> [1] https://en.wikipedia.org/wiki/Journaling_file_system
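
And the recovery half, in the same toy model: a transaction whose
commit record reached the journal is replayed in full; anything after
the last commit record is dropped, leaving those blocks unchanged.
Note that it only knows about the updates it recorded -- exactly your
point that it cannot say whether the rest of the disk is correct:

def replay(journal, disk):
    """After a crash: apply committed transactions, drop the rest."""
    pending = []
    for entry in journal:
        if entry[0] == "data":
            pending.append(entry)
        elif entry[0] == "commit":
            # The commit record made it out before the crash: replay.
            for _tag, block, data in pending:
                disk[block] = data
            pending = []
    # Whatever is left in `pending` lacks a commit record; discarding
    # it leaves the affected blocks untouched -- all or nothing.
    return disk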

> When using a non-checksumming file system we are trusting the storage
> to either return correct data or report a read error (bad block) in
> order to restore it from the "other" copies (in case of RAID) or,
> worst case, from backup.

> If the storage returns *wrong* data instead, a non-checksumming file
> system will never notice, as it has no concept of what the correct
> data should be.
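
To make the difference concrete, a sketch (Python; CRC32 standing in
for whatever checksum a real file system would actually use):

import zlib

checksums = {}  # block number -> CRC recorded at write time

def checked_write(disk, block, data):
    disk[block] = data
    checksums[block] = zlib.crc32(data)

def checked_read(disk, block):
    data = disk[block]
    # A non-checksumming file system returns `data` unconditionally,
    # however mangled.  The checksum turns silent corruption into a
    # detectable error.
    if zlib.crc32(data) != checksums[block]:
        raise IOError(f"checksum mismatch on block {block}")
    return data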

> A RAID by itself is also unable to distinguish between correct and
> incorrect data and may even overwrite the good data with the bad data.
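
Continuing the same toy model, the mirror case: two copies disagree
and block-level RAID has no basis for choosing, whereas a checksum
recorded at write time settles it and allows self-healing:

import zlib

def read_mirror(copy_a, copy_b, checksums, block):
    """Read one block from a two-way mirror (block -> bytes dicts)."""
    a, b = copy_a[block], copy_b[block]
    if a == b:
        return a
    # Plain RAID could pick either copy -- and may "repair" the good
    # one with the bad one.  The checksum identifies the good copy:
    good = a if zlib.crc32(a) == checksums[block] else b
    copy_a[block] = copy_b[block] = good   # heal the bad copy
    return good

This is essentially what a scrub does on a checksumming file system
with redundancy.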

> If we are lucky the error is noticed *and* the backups contain the
> correct data (the backup may have been created from the corrupt data).

> Without good integrity checks it is also very difficult to tell how
> probable such issues are.

Yes, data and metadata integrity checking is different from responding to errors reported by integrated drive controllers and/or interface controllers; they address different failure modes. But Murphy's Law tells me there are others.


I would postulate that copy-on-write technology could be, or already is, included in journaling file systems to improve efficiency.
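
Roughly what I have in mind, as a toy sketch (the pointer-swap model is
my own simplification, not any particular file system's code):

disk = {}        # block number -> bytes
next_free = 0    # trivial bump allocator

def cow_update(pointer, new_data):
    """pointer: one-element list holding the current block number."""
    global next_free
    new_block = next_free       # allocate a fresh block...
    next_free += 1
    disk[new_block] = new_data  # ...write the new version there...
    pointer[0] = new_block      # ...and commit via one pointer swap.
                                # The old block is never overwritten.

root = [None]
cow_update(root, b"version 1")
cow_update(root, b"version 2")  # b"version 1" is still intact on "disk"

The old version survives until the pointer moves, so crash consistency
comes without writing the data twice.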


As they say, "the devil is in the details".


But I am not going to read a mountain of web pages and books, or crawl source code, to find out. As a computer user and hobbyist developer, my challenge is learning what is available OOTB well enough to choose the appropriate pieces for my needs and apply them correctly. My primary metrics are correctness and stability; I avoid "bleeding edge", "testing", "unstable", etc. I must rely on other people to develop, test, document, package, and support what I use.


I await dm-integrity being included in the Debian Installer.


Ubuntu figured out how to do ZFS-on-root OOTB; I wish Debian would do the same.


David

