
[error] child process 32242 still did not exit, sending a SIGKILL



Hello.
I am running into problems when Apache restarts during log rotation.

My setup is as follows:

# apache2
Server version: Apache/2.2.3
Server built:   Feb 12 2007 06:56:06

# php4 -v
PHP 4.4.4-8 (cli) (built: Nov 22 2006 23:41:03)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v1.3.0, Copyright (c) 1998-2004 Zend Technologies

# cat /etc/debian_version
4.0

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      :               Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping        : 9
cpu MHz         : 2999.748
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl cid cx16 xtpr lahf_lm
bogomips        : 6003.81
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      :               Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping        : 9
cpu MHz         : 2999.748
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl cid cx16 xtpr lahf_lm
bogomips        : 5999.74
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

# uname -a
Linux troglod1 2.6.18-3-amd64 #1 SMP Mon Dec 4 17:04:37 CET 2006 x86_64
GNU/Linux

The server is fully up to date.
Last week, when the logs were rotated, "apache2 restart" was unable to
kill all the Apache processes: one of them stayed hung, not serving
requests, and Apache could not be started again. When this happened,
the following errors appeared in the system logs:

Feb 25 06:30:01 Boix kernel: mrtg[9785] general protection
rip:2b786414df50 rsp:7fff46ac8cf0 error:0
Feb 25 06:30:02 Boix kernel: mrtg[9786] general protection
rip:2b26f43046df rsp:7fffb69534b8 error:0
Feb 25 06:33:51 Boix kernel: VM: killing process apache2
Feb 25 06:34:25 Boix kernel: VM: killing process apache2
Feb 25 06:37:58 Boix kernel: VM: killing process apache2
Feb 25 06:45:50 Boix kernel: VM: killing process apache2
Feb 25 06:45:54 Boix kernel: mm/memory.c:109: bad pud
ffff81000fef7000(300000002b261067).
Feb 25 06:45:54 Boix kernel: apache2[10538]: segfault at
0000000000000000 rip 0000000000000000 rsp 00007fff21371658 error 4
Feb 25 06:45:54 Boix kernel: ----------- [cut here ] --------- [please
bite here ] ---------
Feb 25 06:45:54 Boix kernel: CPU 0
Feb 25 06:45:54 Boix kernel: Modules linked in: xt_tcpudp xt_limit
xt_state iptable_nat ip_nat ip_conntrack_ftp ip_conntrack nfnetlink
iptable_filter ip_tables x_tables ipv6 button ac battery dm_snapshot
dm_mirror dm_mod loop snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm
snd_timer snd snd_page_alloc evdev irtty_sir serio_raw shpchp
pci_hotplug parport_pc parport sir_dev i810_audio ac97_codec intel_agp
i2c_i801 pcspkr psmouse irda i2c_core soundcore floppy crc_ccitt ext3
jbd mbcache sd_mod piix ata_piix generic ide_core uhci_hcd ehci_hcd
libata scsi_mod 8139too 8139cp mii thermal processor fan
Feb 25 06:45:54 Boix kernel: Pid: 10538, comm: apache2 Not tainted
2.6.17-2-amd64 #1
Feb 25 06:45:54 Boix kernel: RIP: 0010:[<ffffffff8023814d>]
<ffffffff8023814d>{exit_mmap+226}
Feb 25 06:45:54 Boix kernel: RSP: 0000:ffff81002a9b9d08  EFLAGS: 00010206
Feb 25 06:45:54 Boix kernel: RAX: 0000000000000000 RBX: ffff81000100a2a0
RCX: 0000000000000045
Feb 25 06:45:54 Boix kernel: RDX: 0000000000000001 RSI: ffff81003832ee30
RDI: 000000000003832e
Feb 25 06:45:54 Boix kernel: RBP: 0000000000000000 R08: ffff8100011bc018
R09: 0000000000000000
Feb 25 06:45:54 Boix kernel: R10: 000000000000000b R11: ffff81000100a470
R12: ffff8100369ecec0
Feb 25 06:45:54 Boix kernel: R13: 0000000000000001 R14: ffff81002a9b9ef8
R15: ffff8100116ce7d8
Feb 25 06:45:54 Boix kernel: FS:  0000000000000000(0000)
GS:ffffffff80509000(0000) knlGS:0000000000000000
Feb 25 06:45:54 Boix kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Feb 25 06:45:54 Boix kernel: CR2: 0000000000000000 CR3: 0000000000201000
CR4: 00000000000006e0
Feb 25 06:45:54 Boix kernel: Process apache2 (pid: 10538, threadinfo
ffff81002a9b8000, task ffff8100116ce240)
Feb 25 06:45:54 Boix kernel: Stack: 00000000000006bf ffff81000100a2a0
ffff8100369ecec0 ffff8100369ecf40
Feb 25 06:45:54 Boix kernel:        000000000000000b ffffffff8023a246
000000000000000b 000000000000000b
Feb 25 06:45:54 Boix kernel:        ffff8100116ce240 ffffffff80213f8e
Feb 25 06:45:54 Boix kernel: Call Trace: <ffffffff8023a246>{mmput+40}
<ffffffff80213f8e>{do_exit+541}
Feb 25 06:45:54 Boix kernel:        <ffffffff80246c38>{cpuset_exit+0}
<ffffffff80229962>{get_signal_to_deliver+1134}
Feb 25 06:45:54 Boix kernel:        <ffffffff802283d2>{do_signal+85}
<ffffffff80288c7c>{specific_send_sig_info+161}
Feb 25 06:45:54 Boix kernel:
<ffffffff80288eec>{force_sig_info+158}
<ffffffff8020a9d0>{do_page_fault+1974}
Feb 25 06:45:54 Boix kernel:        <ffffffff80210b3c>{unmap_region+235}
<ffffffff80219531>{remove_vma+85}
Feb 25 06:45:54 Boix kernel:        <ffffffff8025a0d0>{retint_signal+61}
Feb 25 06:45:54 Boix kernel:
Feb 25 06:45:54 Boix kernel: Code: 0f 0b 68 91 75 3f 80 c2 b3 07 59 5e
5b 5d 41 5c c3 53 48 89
Feb 25 06:45:54 Boix kernel:  <1>Fixing recursive fault but reboot is
needed!
Feb 25 06:46:40 Boix kernel: mm/memory.c:103: bad pgd
ffff81000fef7000(3000000032797067).
Feb 25 06:46:40 Boix kernel: apache2[10546]: segfault at
0000000000000000 rip 0000000000000000 rsp 00007fff21371828 error 4
Feb 25 06:46:40 Boix kernel: ----------- [cut here ] --------- [please
bite here ] ---------
Feb 25 06:46:40 Boix kernel: CPU 0
Feb 25 06:46:40 Boix kernel: Modules linked in: xt_tcpudp xt_limit
xt_state iptable_nat ip_nat ip_conntrack_ftp ip_conntrack nfnetlink
iptable_filter ip_tables x_tables ipv6 button ac battery dm_snapshot
dm_mirror dm_mod loop snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm
snd_timer snd snd_page_alloc evdev irtty_sir serio_raw shpchp
pci_hotplug parport_pc parport sir_dev i810_audio ac97_codec intel_agp
i2c_i801 pcspkr psmouse irda i2c_core soundcore floppy crc_ccitt ext3
jbd mbcache sd_mod piix ata_piix generic ide_core uhci_hcd ehci_hcd
libata scsi_mod 8139too 8139cp mii thermal processor fan
Feb 25 06:46:40 Boix kernel: Pid: 10546, comm: apache2 Not tainted
2.6.17-2-amd64 #1
Feb 25 06:46:40 Boix kernel: RIP: 0010:[<ffffffff8023814d>]
<ffffffff8023814d>{exit_mmap+226}
Feb 25 06:46:40 Boix kernel: RSP: 0000:ffff81002450fd08  EFLAGS: 00010206
Feb 25 06:46:40 Boix kernel: RAX: 0000000000000000 RBX: ffff81000100a2a0
RCX: 0000000000000070
Feb 25 06:46:40 Boix kernel: RDX: 0000000000000001 RSI: ffff8100295c34d8
RDI: 00000000000295c3
Feb 25 06:46:40 Boix kernel: RBP: 0000000000000000 R08: ffff8100011bc018
R09: 0000000000000000
Feb 25 06:46:40 Boix kernel: R10: 000000000000000b R11: ffff81002450fd10
R12: ffff810020a4f140
Feb 25 06:46:40 Boix kernel: R13: 0000000000000001 R14: ffff81002450fef8
R15: ffff81003d499698
Feb 25 06:46:40 Boix kernel: FS:  0000000000000000(0000)
GS:ffffffff80509000(0000) knlGS:0000000000000000
Feb 25 06:46:40 Boix kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Feb 25 06:46:40 Boix kernel: CR2: 0000000000000000 CR3: 0000000000201000
CR4: 00000000000006e0
Feb 25 06:46:40 Boix kernel: Process apache2 (pid: 10546, threadinfo
ffff81002450e000, task ffff81003d499100)
Feb 25 06:46:40 Boix kernel: Stack: 00000000000006ce ffff81000100a2a0
ffff810020a4f140 ffff810020a4f1c0
Feb 25 06:46:40 Boix kernel:        000000000000000b ffffffff8023a246
000000000000000b 000000000000000b
Feb 25 06:46:40 Boix kernel:        ffff81003d499100 ffffffff80213f8e
Feb 25 06:46:40 Boix kernel: Call Trace: <ffffffff8023a246>{mmput+40}
<ffffffff80213f8e>{do_exit+541}
Feb 25 06:46:40 Boix kernel:        <ffffffff80246c38>{cpuset_exit+0}
<ffffffff80229962>{get_signal_to_deliver+1134}
Feb 25 06:46:40 Boix kernel:        <ffffffff802283d2>{do_signal+85}
<ffffffff80288c7c>{specific_send_sig_info+161}
Feb 25 06:46:40 Boix kernel:
<ffffffff80288eec>{force_sig_info+158}
<ffffffff8020a9d0>{do_page_fault+1974}
Feb 25 06:46:40 Boix kernel:        <ffffffff8025e442>{thread_return+0}
<ffffffff8025a0d0>{retint_signal+61}
Feb 25 06:46:40 Boix kernel:
Feb 25 06:46:40 Boix kernel: Code: 0f 0b 68 91 75 3f 80 c2 b3 07 59 5e
5b 5d 41 5c c3 53 48 89
Feb 25 06:46:40 Boix kernel:  <1>Fixing recursive fault but reboot is
needed!
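
To make it easier to compare a log like the one above before and after a
hardware change, the messages can be tallied by type. This is only a
minimal sketch: the category names and the regular expressions are
assumptions chosen from the lines quoted above, not an official list of
kernel messages, and the sample lines are shortened copies of that log.

```python
import re
from collections import Counter

# Hypothetical categories, derived only from the messages quoted in this post.
PATTERNS = {
    "general_protection": re.compile(r"general protection"),
    "vm_kill": re.compile(r"VM: killing process"),
    "bad_page_table": re.compile(r"mm/memory\.c:\d+: bad (pud|pgd|pmd)"),
    "segfault": re.compile(r"segfault at"),
    "recursive_fault": re.compile(r"Fixing recursive fault"),
}


def tally_kernel_events(lines):
    """Count how many log lines match each category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Shortened sample lines taken from the log above.
sample_kernel_lines = [
    "Feb 25 06:30:01 Boix kernel: mrtg[9785] general protection",
    "Feb 25 06:33:51 Boix kernel: VM: killing process apache2",
    "Feb 25 06:45:54 Boix kernel: mm/memory.c:109: bad pud",
    "Feb 25 06:45:54 Boix kernel: apache2[10538]: segfault at",
    "Feb 25 06:46:40 Boix kernel: mm/memory.c:103: bad pgd",
]

kernel_counts = tally_kernel_events(sample_kernel_lines)
for name, count in sorted(kernel_counts.items()):
    print(f"{name}: {count}")
```

On a real system the lines would come from the kernel log file instead of
the sample list. Its path varies by setup, so it is left out here.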

When this happens the rest of the system keeps working, but there is no
way to kill the hung Apache process. The only solution is to reboot the
server.
I remember that, in addition to this, the Apache logs showed a
"segmentation fault (11)", and I read that this could be caused by
faulty memory. I replaced the RAM and, sure enough, I have not seen
those kernel errors since.
I don't know whether that earlier error is related to the current one,
but now every day, when the logs rotate and Apache is restarted, I get
these messages in Apache's error.log:

[Fri Mar 09 09:01:08 2007] [warn] child process 32242 still did not
exit, sending a SIGTERM
[Fri Mar 09 09:01:10 2007] [warn] child process 32242 still did not
exit, sending a SIGTERM
[Fri Mar 09 09:01:12 2007] [warn] child process 32242 still did not
exit, sending a SIGTERM
[Fri Mar 09 09:01:14 2007] [error] child process 32242 still did not
exit, sending a SIGKILL
[Fri Mar 09 09:01:15 2007] [notice] caught SIGTERM, shutting down
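
The sequence above (several SIGTERMs followed by a SIGKILL for the same
child) can be pulled out of error.log automatically, which helps to
check whether it happens at every rotation. This is a rough sketch: the
function names are made up for illustration, and the regular expression
only covers the message format shown in this post.

```python
import re
from collections import defaultdict

# Matches only the "still did not exit" messages quoted in this post.
LINE_RE = re.compile(
    r"\[(?:warn|error)\] child process (?P<pid>\d+) still did not exit, "
    r"sending a (?P<signal>SIG[A-Z]+)"
)


def signals_per_child(log_text):
    """Map each child PID to the signals Apache reported sending to it."""
    # Collapse whitespace so entries wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    history = defaultdict(list)
    for match in LINE_RE.finditer(flat):
        history[int(match.group("pid"))].append(match.group("signal"))
    return dict(history)


def killed_children(log_text):
    """Return the PIDs that needed a SIGKILL, in ascending order."""
    return sorted(
        pid for pid, sigs in signals_per_child(log_text).items()
        if "SIGKILL" in sigs
    )


# The error.log excerpt from this post, including its line wrapping.
sample_error_log = """\
[Fri Mar 09 09:01:08 2007] [warn] child process 32242 still did not
exit, sending a SIGTERM
[Fri Mar 09 09:01:10 2007] [warn] child process 32242 still did not
exit, sending a SIGTERM
[Fri Mar 09 09:01:12 2007] [warn] child process 32242 still did not
exit, sending a SIGTERM
[Fri Mar 09 09:01:14 2007] [error] child process 32242 still did not
exit, sending a SIGKILL
[Fri Mar 09 09:01:15 2007] [notice] caught SIGTERM, shutting down
"""

print(signals_per_child(sample_error_log))
print(killed_children(sample_error_log))
```

Whitespace is collapsed before matching because the entries here were
wrapped when pasted into the mail. In the real error.log each entry sits
on a single line, and the same code works unchanged.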

After this Apache dies: all of its processes are stopped, but Apache can
be started manually without any problem. Everything then works fine
until the next logrotate run (even running "apache2 restart" by hand at
that point does not produce the errors).

This is Apache's logrotate script:
# cat /etc/logrotate.d/apache2
/var/log/apache2/*.log {
       daily
       missingok
       rotate 7
       compress
       delaycompress
       notifempty
       create 640 root adm
       sharedscripts
       postrotate
               if [ -f /var/run/apache2.pid ]; then
                       /etc/init.d/apache2 restart > /dev/null
               fi
       endscript
}

I don't think it is a configuration issue, since I have several servers
and copy/paste the same configuration onto all of them. Could it be
hardware again? Perhaps something to do with the CPUs? I am using the
amd64 SMP kernel on a 64-bit Intel processor.

Thanks.



