
Re: postfix cleanup stuck (cannot kill it)



Erik Steffl wrote:
> 
>   it happened twice in the last few weeks: cleanup (part of postfix)
> eats up all the cpu cycles and cannot be killed (even with -9).
> 
>   I guess that means it's stuck in a system call, so it looks like a
> kernel problem. any ideas on what to do (well, I know I can reboot:-)?
> how to troubleshoot?
> 
>   this is debian unstable, postfix 1.1.7-7, kernel 2.4.18. this doesn't
> seem to be related to a kernel upgrade (which I did a long time ago). I am
> not even sure if it's postfix related, though it seems I updated some part
> of postfix during May.
> 
>   the following messages in syslog might be related (other than these
> I've found nothing suspicious):
> 
> Jun 20 11:12:02 localhost postfix/smtpd[6575]: warning: premature
> end-of-input from public/cleanup socket while reading input attribute
> name
> Jun 20 11:12:02 localhost postfix/smtpd[6575]: fatal: unable to connect
> to the public cleanup service
> Jun 20 11:12:03 localhost postfix/master[505]: warning: process
> /usr/lib/postfix/smtpd pid 6575 exit status 1
> Jun 20 11:12:03 localhost postfix/master[505]: warning:
> /usr/lib/postfix/smtpd: bad command startup -- throttling
> 
>   there's no bug filed for this (in the postfix package). should I file
> one? does it look like a cleanup bug, or a kernel bug?
> 
>   TIA

  now this is really strange: after a few days at 99% load, during which
I could not interrupt cleanup (I attached gdb to it and hit ctrl-c), it
was finally interrupted, and now it's in:

(gdb) where
#0  0x401f82e4 in open () from /lib/libc.so.6
#1  0x4004354c in rewrite_clnt_stream () from /usr/lib/libpostfix-global.so.1
#2  0x40038151 in mail_stream_file () from /usr/lib/libpostfix-global.so.1
#3  0x0804be05 in dict_changed ()
#4  0x08049a65 in dict_changed ()
#5  0x40027b85 in _init () from /usr/lib/libpostfix-master.so.1
#6  0x40027ce1 in _init () from /usr/lib/libpostfix-master.so.1
#7  0x40051e05 in event_loop () from /usr/lib/libpostfix-util.so.1
#8  0x400286ce in single_server_main () from /usr/lib/libpostfix-master.so.1
#9  0x08049c4e in dict_changed ()
#10 0x4014e14f in __libc_start_main () from /lib/libc.so.6
(gdb) 

  since frame #0 is still inside libc's open(), I guess it was stuck in
the open system call?
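
  one way to narrow that guess down, as a rough sketch rather than a
tested diagnosis: /proc/<pid>/stat reports the process state plus
separate user and system cpu counters (utime and stime, fields 14 and 15
according to proc(5)). if stime keeps climbing while utime stays flat,
the cycles are being burned inside the kernel on behalf of the process,
which would fit a stuck system call. the function names below are made
up for illustration, and I have not checked the field layout against
2.4.18 specifically.

```python
import time


def parse_proc_stat(line):
    """Parse one line in /proc/<pid>/stat format into the fields of interest."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    # fields[0] is field 3 of the file (state). utime and stime are
    # fields 14 and 15 (assumed from proc(5)), i.e. fields[11] and fields[12].
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],
        "utime": int(fields[11]),
        "stime": int(fields[12]),
    }


def cpu_delta(pid, interval=5.0):
    """Return (state, user ticks, system ticks) consumed over `interval` seconds."""
    def read():
        with open("/proc/%d/stat" % pid) as f:
            return parse_proc_stat(f.read())

    before = read()
    time.sleep(interval)
    after = read()
    return (
        after["state"],
        after["utime"] - before["utime"],
        after["stime"] - before["stime"],
    )
```

  calling cpu_delta(pid) with the pid of the stuck cleanup process gives
the state letter (R for running, D for uninterruptible sleep) and the
user and system clock ticks used during the interval.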

  any ideas about what's going on?

	erik

