
Re: kill: cannot kill some processes



> On Sat, Mar 03, 2001 at 08:52:36AM -0500, Cory Snavely wrote:
> > Right now on a big Solaris machine of mine I have about a dozen zombied
> > Perls--parent process (Apache) long gone, and when I -9ed them, their PPIDs
> > became 1 (init). Classic zombie.
>
> Hrrrm?  Not quite.  Init eventually inherits zombie children (when the
> parent dies), but init reaps the dead children.  Perhaps your children
> aren't dead?

Brian, you're right. Now that I look more closely, they're in "sleep" state.
If I just knew why...

> > Problem is, these Perls are running scripts off a software RAID, and thus
> > have it locked. This happened before--when I reboot the server to get rid of
> > the zombies, or some other reason, the filesystem won't unmount, won't get a
> > clean flag, and therefore will force fsck on reboot. As it's over 100GB, a
> > full fsck takes several hours.
> >
> > Now maybe there's something I don't know to recover from this cleanly, or
> > maybe Linux handles it a different way, but it seems like this is an example
> > of zombies causing a real problem. If anyone knows a way around it, I'd be
> > real grateful!
>
> Doesn't sound like a zombie to me.  A zombie has -no- open files and goes
> away as soon as init inherits it.  A zombie is in state 'Z' on ps.
>
> What you describe sounds more like something in state 'D', which is
> waiting for IO to complete.  (This can happen on NFS when things break in
> just the wrong way for some reason.)  They're not zombies because they're
> not dead yet (they need to release their files before they are really
> dead).
>
> For processes stuck in a 'D' state, there is very little you can do about
> them.  You may be able to sneak out of re-fscking by remounting the drive
> read-only before rebooting, though.

Yeah, that's what I was thinking. Thanks.
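
For anyone wanting to check the 'Z' versus 'D' distinction quoted above on
their own box, here is a rough Python sketch that groups processes by the
state letter ps reports. It assumes `ps -eo pid,ppid,s,comm` is accepted
(it should be on Solaris and on Linux procps, but check your ps man page),
and the function names are made up for illustration. State letters differ
between systems: 'D' is what Linux shows, while other systems may report a
process blocked on IO as plain sleep.

```python
import subprocess


def parse_ps_states(ps_text):
    """Group processes by state letter from `ps -eo pid,ppid,s,comm` output.

    Returns a dict mapping a one-letter state (e.g. 'Z', 'D', 'S') to a list
    of (pid, ppid, command) tuples. The column layout is an assumption.
    """
    by_state = {}
    for line in ps_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # not a process row
        pid, ppid, state = fields[0], fields[1], fields[2]
        comm = fields[3].strip() if len(fields) > 3 else ""
        by_state.setdefault(state[0], []).append((int(pid), int(ppid), comm))
    return by_state


def snapshot():
    """Run ps once and return the processes grouped by state."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,s,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps_states(out)


if __name__ == "__main__":
    states = snapshot()
    # Zombies: already dead, waiting to be reaped by their parent (or init).
    print("Z:", states.get("Z", []))
    # Linux-style uninterruptible sleep: signals, including -9, are not
    # delivered until the pending IO completes.
    print("D:", states.get("D", []))
```

Anything listed under 'Z' holds no open files and should disappear once its
parent or init reaps it. Anything stuck under 'D' is still alive and may
still be holding the filesystem busy.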




