Re: Debian Squeeze ext4 corruption under VMware
On Tue, 15 May 2012 17:00:02 +0200
David Parker <dparker@utica.edu> wrote:
> Hello,
>
> We have several Debian Squeeze servers running in VMs under VMware ESXi
> 4.1.0 (build 502767 if that matters) with the latest VMware Tools bundle
> installed. We're using the ext4 filesystem on each of these servers.
> We have had a few crashes of our VMware infrastructure, and each time,
> the Debian servers have all suffered filesystem corruption. The problem
> seems to be that VMware attempts to "freeze" each VM when something goes
> wrong, and depending on the circumstances, it tries to move each VM to
> another VMware server. This works fine for our Windows servers, but the
> Debian servers get all messed up. Each VM remains in a "running" state,
> but the root filesystem is mounted read-only, and the console shows a
> ton of filesystem errors. In most cases, the corruption has been
> recoverable by booting the VM to a Knoppix live CD and running fsck on
> the unmounted filesystem. We've tried forcing fsck to run on boot, but
> for some reason it will not repair the filesystem, hence why we need to
> boot to a live CD. In a few isolated cases, we have ended up with
> serious filesystem damage resulting in a huge number of files in
> /lost+found, and we've just rebuilt the VMs.
>
> I'm just wondering if anyone else has seen this, or if anyone knows a
> way to make Debian deal with VMware's shenanigans more smoothly. We do
> have a planned upgrade to VMware ESXi 5.0 in the next few months, and
> we're looking to get a new SAN solution (our SAN has been the source of
> at least two of these crashes), but I'd really like to get a handle on
> this issue sooner in case we have another problem. I've Googled this
> problem, but I'm not finding much useful information.
>
> Thanks!
>
> - Dave
>
From man mount:
    barrier=0 / barrier=1 / barrier / nobarrier
        This enables/disables the use of write barriers in the
        jbd code. barrier=0 disables, barrier=1 enables. This also
        requires an IO stack which can support barriers, and if jbd
        gets an error on a barrier write, it will disable again with a
        warning. Write barriers enforce proper on-disk ordering of
        journal commits, making volatile disk write caches safe to use,
        at some performance penalty. If your disks are battery-backed
        in one way or another, disabling barriers may safely improve
        performance. The mount options "barrier" and "nobarrier" can
        also be used to enable or disable barriers, for consistency
        with other ext4 mount options.
        The ext4 filesystem enables write barriers by default.
This is what I used on an NFS server serving filesystems from an EMC
SAN years back, because the EMC SAN would sometimes be MIA for a second
or two and then come back into existence...
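
To see which barrier setting each ext4 filesystem is actually mounted with, something like the sketch below could be run against the contents of /proc/mounts. The function names are made up for illustration, and it assumes the kernel reports the option as one of the spellings listed in the man page text above; a filesystem with no barrier option shown is reported as "default", which for ext4 should mean barriers are on.

```python
def barrier_state(options):
    """Classify a comma-separated mount option string.

    Returns "enabled", "disabled", or "default" (no barrier option listed).
    """
    state = "default"
    for opt in options.split(","):
        if opt in ("nobarrier", "barrier=0"):
            state = "disabled"
        elif opt in ("barrier", "barrier=1"):
            state = "enabled"
    return state


def ext4_barrier_report(mounts_text):
    """Map each ext4 mount point to its barrier state.

    mounts_text is expected in /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    report = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "ext4":
            report[fields[1]] = barrier_state(fields[3])
    return report


# Made-up sample lines in /proc/mounts layout, for illustration only.
sample = (
    "/dev/sda1 / ext4 rw,relatime,barrier=1,data=ordered 0 0\n"
    "/dev/sdb1 /srv ext4 rw,nobarrier 0 0\n"
    "proc /proc proc rw 0 0\n"
)
for mountpoint, state in ext4_barrier_report(sample).items():
    print(mountpoint, state)
```

On a real system, pass in open("/proc/mounts").read() instead of the sample string.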
--
//Jacob Gaarde
//Dont reply to my (apparent) e-mail address. Instead Use
//e-mail : jgaarde <at> gmail <dot> com
<http://www.linkedin.com/in/jacobgaarde>