
Bug#981644: Reload does not reload all threads, causing multiple issues



Package: apache2
Version: 2.4.38-3+deb10u4
Severity: important

Quite recently we've noticed that an Apache reload no longer seems to work correctly.
This causes multiple issues:
- Old certificates are still served on some connections after they are renewed and Apache is reloaded
- Old log files keep receiving log entries
- Other config changes do not take effect

The server-status output of an affected server was the following:
Server Version: Apache/2.4.38 (Debian) OpenSSL/1.1.1d
Server MPM: event
Server Built: 2020-08-25T20:08:29

Current Time: Tuesday, 02-Feb-2021 14:20:04 CET
Restart Time: Tuesday, 19-Jan-2021 14:40:04 CET
Parent Server Config. Generation: 19
Parent Server MPM Generation: 18
Server uptime: 13 days 23 hours 39 minutes 59 seconds
Server load: 0.81 0.59 0.70
Total accesses: 7354272 - Total Traffic: 290.9 GB - Total Duration: 169255120958
CPU Usage: u15165 s3136.47 cu7847.96 cs1421.78 - 2.28% CPU load
6.09 requests/sec - 252.4 kB/second - 41.5 kB/request - 23014.5 ms/request
122 requests currently being processed, 41 idle workers
Slot  PID  Stopping  Connections(total)  Connections(accepting)  Threads(busy)  Threads(idle)  Async(writing)  Async(keep-alive)  Async(closing)
0 94759 no (old gen) 20 yes 17 8 0 0 3
1 94760 no (old gen) 39 yes 23 2 0 0 10
2 94882 no (old gen) 26 yes 19 6 0 0 7
3 94925 no (old gen) 15 yes 6 19 0 7 2
4 94926 no (old gen) 11 yes 5 20 0 2 3
5 94983 no (old gen) 19 yes 14 11 0 3 1
6 94984 no (old gen) 21 yes 19 6 0 1 1
7 94985 no (old gen) 7 yes 2 23 0 0 4
8 94986 no (old gen) 10 yes 8 17 0 1 1
9 67090 yes (old gen) 2 no 0 0 0 0 0
10 10316 yes (old gen) 1 no 0 0 0 0 0
11 20609 yes (old gen) 16 no 0 0 0 0 0
12 14641 yes (old gen) 9 no 0 0 0 0 0
13 23297 yes (old gen) 1 no 0 0 0 0 0
14 28208 yes (old gen) 4 no 0 0 0 0 0
16 45371 yes (old gen) 4 no 0 0 0 0 0
17 54143 yes (old gen) 6 no 0 0 0 0 0
18 2036 yes (old gen) 6 no 0 0 0 0 0
20 109223 yes (old gen) 1 no 0 0 0 0 0
21 13401 yes (old gen) 5 no 0 0 0 0 0
22 14642 yes (old gen) 15 no 0 0 0 0 0
23 10506 yes (old gen) 3 no 0 0 0 0 0
24 24234 yes (old gen) 1 no 0 0 0 0 0
25 54144 yes (old gen) 2 no 0 0 0 0 0
26 52443 no 11 yes 4 21 0 0 7
27 52444 no 10 yes 5 20 0 1 2
Sum 26 15 265   122 153 0 15 41


Several child processes (slots 0-8) are marked old gen, yet for some reason they are not stopping and still accept connections.
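
To spot such processes without reading the table by eye, the rows can be filtered with a small script. This is only a sketch: it assumes the process table has been saved as plain text with one row per line in the layout shown above, and the names `ROW` and `stuck_old_gen` are made up for this illustration.

```python
import re

# One row of the process table, e.g.
# "0 94759 no (old gen) 20 yes 17 8 0 0 3"
# Columns assumed: slot, PID, stopping, optional "(old gen)",
# total connections, accepting, then the thread/async counters.
ROW = re.compile(
    r"^(?P<slot>\d+)\s+(?P<pid>\d+)\s+(?P<stopping>yes|no)"
    r"(?P<old>\s+\(old gen\))?\s+(?P<conns>\d+)\s+(?P<accepting>yes|no)\b"
)


def stuck_old_gen(status_text):
    """Return (slot, pid, connections) for old-gen processes that are not stopping."""
    stuck = []
    for line in status_text.splitlines():
        m = ROW.match(line.strip())
        # Header and "Sum" lines do not match the pattern and are skipped.
        if m and m.group("old") and m.group("stopping") == "no":
            stuck.append(
                (int(m.group("slot")), int(m.group("pid")), int(m.group("conns")))
            )
    return stuck


# Three rows copied from the table above: stuck old gen,
# old gen that is stopping normally, and current generation.
sample = """\
0 94759 no (old gen) 20 yes 17 8 0 0 3
9 67090 yes (old gen) 2 no 0 0 0 0 0
26 52443 no 11 yes 4 21 0 0 7
"""

print(stuck_old_gen(sample))  # [(0, 94759, 20)]
```

Run against the full table, this should list slots 0 through 8, i.e. the processes started on Jan19.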

root      94750  0.0  0.2  24496  8804 ?        Ss   Jan19   0:50 /usr/sbin/apache2 -k start
www-data  94759  0.1  0.7 1237064 31376 ?       Sl   Jan19  35:24  \_ /usr/sbin/apache2 -k start
www-data  94760  0.2  0.7 1237828 31544 ?       Sl   Jan19  56:13  \_ /usr/sbin/apache2 -k start
www-data  94882  0.1  0.7 1236748 29864 ?       Sl   Jan19  39:26  \_ /usr/sbin/apache2 -k start
www-data  94925  0.1  0.4 1235376 19788 ?       Sl   Jan19  29:47  \_ /usr/sbin/apache2 -k start
www-data  94926  0.1  0.7 1234844 29164 ?       Sl   Jan19  36:08  \_ /usr/sbin/apache2 -k start
www-data  94983  0.0  0.7 1235876 29720 ?       Sl   Jan19  19:23  \_ /usr/sbin/apache2 -k start
www-data  94984  0.1  0.4 1236628 19276 ?       Sl   Jan19  21:13  \_ /usr/sbin/apache2 -k start
www-data  94985  0.1  0.6 1235020 27668 ?       Sl   Jan19  24:36  \_ /usr/sbin/apache2 -k start
www-data  94986  0.1  0.4 1234804 17952 ?       Sl   Jan19  24:10  \_ /usr/sbin/apache2 -k start
www-data  24234  0.0  0.0 1230416 1444 ?        Sl   Jan25   2:22  \_ /usr/sbin/apache2 -k start
www-data  23297  0.0  0.0 1230344 2528 ?        Sl   Jan29   5:04  \_ /usr/sbin/apache2 -k start
www-data   2036  0.0  0.1 1230136 4676 ?        Sl   Jan31   3:03  \_ /usr/sbin/apache2 -k start
www-data  13401  0.2  0.2 1230972 9256 ?        Sl   Feb01   4:53  \_ /usr/sbin/apache2 -k start
www-data  10506  0.1  0.2 1230240 9372 ?        Sl   Feb01   2:19  \_ /usr/sbin/apache2 -k start
www-data  28208  0.1  0.2 1230336 9304 ?        Sl   Feb01   3:15  \_ /usr/sbin/apache2 -k start
www-data  54143  0.3  0.3 1231124 15792 ?       Sl   00:00   2:41  \_ /usr/sbin/apache2 -k start
www-data  54144  0.1  0.3 1230296 14516 ?       Sl   00:00   1:27  \_ /usr/sbin/apache2 -k start
www-data 109223  0.2  0.3 1230424 15012 ?       Sl   03:05   1:56  \_ /usr/sbin/apache2 -k start
www-data  67090  0.1  0.3 1229988 14364 ?       Sl   08:00   0:35  \_ /usr/sbin/apache2 -k start
www-data  10316  0.0  0.3 1229796 14020 ?       Sl   12:04   0:02  \_ /usr/sbin/apache2 -k start
www-data  14641  0.3  0.4 1232032 17448 ?       Sl   12:11   0:31  \_ /usr/sbin/apache2 -k start
www-data  14642  0.3  0.4 1232636 18396 ?       Sl   12:11   0:24  \_ /usr/sbin/apache2 -k start
www-data  20609  0.4  0.4 1233028 18712 ?       Sl   12:30   0:33  \_ /usr/sbin/apache2 -k start
www-data  45371  0.1  0.3 1230648 15236 ?       Sl   13:55   0:02  \_ /usr/sbin/apache2 -k start
www-data  52443  0.6  0.7 1234084 28836 ?       Sl   14:16   0:03  \_ /usr/sbin/apache2 -k start
www-data  52444  0.4  0.7 1233180 28196 ?       Sl   14:16   0:02  \_ /usr/sbin/apache2 -k start

These processes keep serving the old config, SSL certificates, etc., which causes the issues.

A simple Apache restart fixes everything, but after another reload the problem is back:

3 65777 no (old gen) 22 yes 4 21 0 4 12


The processes can't be stopped cleanly on a restart either:
error.log:[Tue Feb 02 14:29:45.756841 2021] [core:warn] [pid 94750:tid 139782663799936] AH00045: child process 94882 still did not exit, sending a SIGTERM
error.log:[Tue Feb 02 14:29:47.758636 2021] [core:warn] [pid 94750:tid 139782663799936] AH00045: child process 94882 still did not exit, sending a SIGTERM
error.log:[Tue Feb 02 14:29:49.760955 2021] [core:warn] [pid 94750:tid 139782663799936] AH00045: child process 94882 still did not exit, sending a SIGTERM
error.log:[Tue Feb 02 14:29:51.762668 2021] [core:error] [pid 94750:tid 139782663799936] AH00046: child process 94882 still did not exit, sending a SIGKILL
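
To see which children had to be killed on a busier server, the AH00045/AH00046 messages can be counted per PID. Again only a sketch: it assumes the message wording shown in the log lines above, and `LOG` and `summarize_forced_exits` are hypothetical names.

```python
import re
from collections import Counter

# Matches the "still did not exit" messages quoted above.
LOG = re.compile(
    r"(AH00045|AH00046): child process (\d+) still did not exit, "
    r"sending a (SIGTERM|SIGKILL)"
)


def summarize_forced_exits(log_text):
    """Map each child PID to how many SIGTERM/SIGKILL messages were logged for it."""
    counts = {}
    for m in LOG.finditer(log_text):
        pid, signal = int(m.group(2)), m.group(3)
        counts.setdefault(pid, Counter())[signal] += 1
    return {pid: dict(c) for pid, c in counts.items()}


# Two of the lines quoted above.
sample_log = """\
error.log:[Tue Feb 02 14:29:49.760955 2021] [core:warn] [pid 94750:tid 139782663799936] AH00045: child process 94882 still did not exit, sending a SIGTERM
error.log:[Tue Feb 02 14:29:51.762668 2021] [core:error] [pid 94750:tid 139782663799936] AH00046: child process 94882 still did not exit, sending a SIGKILL
"""

print(summarize_forced_exits(sample_log))
# {94882: {'SIGTERM': 1, 'SIGKILL': 1}}
```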



Upstream has a similar bug report:
https://bz.apache.org/bugzilla/show_bug.cgi?id=63169

But it's strange that we've hit this problem multiple times now in the last week.

Thanks
Jean-Louis

