
Re: file systems

On 5/4/2011 6:44 PM, Boyd Stephen Smith Jr. wrote:
In <4DC1E009.30209@hardwarefreak.com>, Stan Hoeppner wrote:
On 5/2/2011 4:02 PM, Boyd Stephen Smith Jr. wrote:
They are also essential for any journaled filesystem to have correct
behavior in the face of sudden power loss.

This is true only if you don't have BBWC.

No.  It is true even with BBWC.

No, it's not. Sorry, I didn't find any Debian documentation to prove my point, so I'll use the Red Hat docs:


"For devices with non-volatile, battery-backed write caches and those with write-caching disabled, you can safely disable write barriers at mount time using the -o nobarrier option for mount. However, some devices do not support write barriers; such devices will log an error message to /var/log/messages (refer to Table 17.1, “Write barrier error messages per file system”)."

You will see such errors with very high end SAN arrays, as I previously mentioned. They simply don't support write barriers. Why? Because constantly flushing an entire 16-64 *GigaByte* battery- or flash-backed write cache, sitting in front of 2048 SAS drives, because 64 servers on the SAN keep issuing barriers at the rate of 10,000/second, is a mind-numbingly dumb thing to do.
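For concreteness, the mount-time option the Red Hat passage describes could appear in /etc/fstab like this; the device names and mount points below are made-up examples, not anything from this thread:

```
# hypothetical /etc/fstab entries -- devices and mount points are examples
/dev/sda1  /data  xfs   defaults,nobarrier  0  2
/dev/sdb1  /srv   ext4  defaults,barrier=0  0  2
```

(XFS spells the option "nobarrier"; ext4 of this era also accepts "barrier=0".)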


"Write barriers are also unnecessary whenever the system uses hardware RAID controllers with battery-backed write cache. If the system is equipped with such controllers and if its component drives have write caches disabled, the controller will advertise itself as a write-through cache; this will inform the kernel that the write cache data will survive a power loss."

Even with a battery-backed RAID cache, like I have in my desktop,
executing without barriers can result in extra data loss that executing
with barriers prevents.

Then I'd say you have a problem with your BBWC RAID controller in your
desktop.  Which BBWC RAID card do you have?

Areca ARC-1160.

Can you kindly point me to your past posts where you discussed this 'extra data loss' problem you experienced? After AC power loss, with your Areca ARC-1160 and its ARC-6120BA-T112 battery unit? I'd like to better understand the circumstances surrounding the data loss.

Of course, even without barriers a properly journaled or log-structured
filesystem should be able to immediately and silently recover.

This contradicts what you stated above.

No, it doesn't.  The filesystem can recover by dropping or replaying journal /
log entries that were not yet flushed to disk.  That doesn't mean you haven't
lost any data, if parts of the journal that existed in cache before the power
loss were never written out.

The argument you made was that barriers are required to maintain correct journal write ordering. If that order isn't maintained because barriers are turned off, then, by your own argument, replaying the out-of-order journal will likely corrupt the filesystem. You seem to be arguing from both sides of the fence.

With barriers, you are guaranteed to be able to recover to the last barrier.
Without them, the hardware may have fully, partially, or not at all completed
virtually any I/O.

This is generally true, but depends on the 'hardware' you're referring to, as I've pointed out a few times now in this thread.

This is why (good) BBWC enabled RAID cards automatically disable the
caches on all the drives,

Mine provides the option.  I can't remember what setting I'm using right now.
IIRC, I continue to use the drives' write caches because I have a UPS that
provides enough time for a clean shutdown, even when under load.

Given that you have both the ARC-6120BA-T112 RAID card battery and a UPS, I'm now really curious to know more about your data loss due to not using barriers.

and thus why it is recommended to disable
barriers for filesystems on BBWC RAID cards.

By whom?  Reference please.

Links and excerpts provided above.

The nobarrier results are far more relevant than the barrier results,
especially the 16 and 128 thread results, for those SAs with high
performance persistent storage.

I disagree entirely.  You should be looking at the threaded results,
probably 128 threads (depending on what the server does), but you should
also be using barriers.

You just said you "disagree entirely" and then say 128 threads, same
thing I said.  But then you recommend barriers, which is the disagreement.

You said 128 threads unconditionally, I admitted that there are certain
workloads where 16 threads is a more correct model.

The multi-thread tests are simply used to show how each filesystem scales with parallel workloads. Some servers will never see 16 parallel I/O streams, such as most SOHO servers. Some servers will see thousands of simultaneous I/O streams, such as the Linux kernel archive servers. There is no "correct model".
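The kind of parallel-workload scaling test being argued about can be sketched with fio; everything below (directory, sizes, runtime, job counts) is an illustrative assumption, not the settings behind the results discussed in this thread:

```shell
# Sketch: compare filesystem behavior at 16 vs. 128 concurrent writers.
# All parameter values are assumptions for illustration only.
for jobs in 16 128; do
    fio --name=scale-test --directory=/mnt/testfs \
        --rw=randwrite --bs=4k --size=256m --direct=1 \
        --numjobs="$jobs" --group_reporting \
        --runtime=60 --time_based
done
```

Run once with the filesystem mounted with barriers and once without to see the effect being debated.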

