
Re: Debian Squeeze ext4 corruption under VMware



On Tue, 15 May 2012 17:00:02 +0200
David Parker <dparker@utica.edu> wrote:

> Hello,
> 
> We have several Debian Squeeze servers running in VMs under VMware ESXi 
> 4.1.0 (build 502767 if that matters) with the latest VMware Tools bundle 
> installed.  We're using the ext4 filesystem on each of these servers.  
> We have had a few crashes of our VMware infrastructure, and each time, 
> the Debian servers have all suffered filesystem corruption.  The problem 
> seems to be that VMware attempts to "freeze" each VM when something goes 
> wrong, and depending on the circumstances, it tries to move each VM to 
> another VMware server.  This works fine for our Windows servers, but the 
> Debian servers get all messed up.  Each VM remains in a "running" state, 
> but the root filesystem is mounted read-only, and the console shows a 
> ton of filesystem errors.  In most cases, the corruption has been 
> recoverable by booting the VM to a Knoppix live CD and running fsck on 
> the unmounted filesystem.  We've tried forcing fsck to run on boot, but 
> for some reason it will not repair the filesystem, hence why we need to 
> boot to a live CD.  In a few isolated cases, we have ended up with 
> serious filesystem damage resulting in a huge number of files in 
> /lost+found, and we've just rebuilt the VMs.
> 
> I'm just wondering if anyone else has seen this, or if anyone knows a 
> way to make Debian deal with VMware's shenanigans more smoothly.  We do 
> have a planned upgrade to VMware ESXi 5.0 in the next few months, and 
> we're looking to get a new SAN solution (our SAN has been the source of 
> at least two of these crashes), but I'd really like to get a handle on 
> this issue sooner in case we have another problem.  I've Googled this 
> problem, but I'm not finding much useful information.
> 
> Thanks!
> 
>      - Dave
> 
from man mount :

       barrier=0 / barrier=1 / barrier / nobarrier
              This enables/disables the use of write barriers in the
              jbd code.  barrier=0 disables, barrier=1 enables.  This
              also requires an IO stack which can support barriers, and
              if jbd gets an error on a barrier write, it will disable
              again with a warning.  Write barriers enforce proper
              on-disk ordering of journal commits, making volatile
              disk write caches safe to use, at some performance
              penalty.  If your disks are battery-backed in one way
              or another, disabling barriers may safely improve
              performance.  The mount options "barrier" and
              "nobarrier" can also be used to enable or disable
              barriers, for consistency with other ext4 mount options.

              The ext4 filesystem enables write barriers by default.


This is what I used on an NFS server serving filesystems from an EMC
SAN years back, because the EMC SAN would sometimes be MIA for a
second or two and then come back into existence...
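Since ext4 enables barriers by default, the first thing worth checking
is whether they are actually in effect on the running system (jbd can
silently disable them after a failed barrier write, per the man page
above).  A rough sketch of what I mean -- the device names, UUID, and
exact /proc/mounts output are assumptions and vary by kernel version,
so adjust for your own layout:

```shell
# Show the mount options currently in effect for the root filesystem;
# depending on kernel version, ext4 reports "barrier=1" (enabled) or
# "nobarrier"/"barrier=0" (disabled) here.
grep ' / ' /proc/mounts

# To request barriers explicitly, add barrier=1 to the options field
# in /etc/fstab (UUID below is a placeholder), e.g.:
#
#   UUID=xxxx-xxxx  /  ext4  errors=remount-ro,barrier=1  0  1

# Or remount the live system with barriers explicitly enabled
# (requires root):
mount -o remount,barrier=1 /
```

Note that barriers only help if the whole IO stack down to VMware's
virtual disk honours them; if it doesn't, jbd will log a warning and
turn them off again, which is exactly the case you'd want to catch.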


-- 
//Jacob Gaarde
//Don't reply to my (apparent) e-mail address. Instead use
//e-mail : jgaarde <at> gmail <dot> com
<http://www.linkedin.com/in/jacobgaarde>

