
Re: Bug#711213: libapache2-mod-perl2: occasional core dumps after the test suite



On Friday 14 June 2013, Niko Tyni wrote:
> On Sun, Jun 09, 2013 at 11:23:01PM +0300, Niko Tyni wrote:
> > On Fri, Jun 07, 2013 at 02:23:43PM +0300, Niko Tyni wrote:
> > > I can reproduce the SIGSEGV at the end of the main test suite
> > > (#711213) on amd64.  The armel problem might well be related,
> > > as the log ends at the same point.
> > 
> > I'm somewhat further now: what happens is that
> > register_auth_provider() in modperl_util.c calls
> > 
> >  apr_pool_pre_cleanup_register(pool, NULL,
> >  cleanup_perl_global_providers);
> > 
> > once in the parent process, then another time in a child. For
> > some reason that I do not understand yet, the
> > cleanup_perl_global_providers() function resides at a different
> > memory location (with a 0x2c000 offset or so) on the second
> > time. The first location has at that point become an invalid
> > memory address, resulting in a SIGSEGV when libapr calls the
> > registered cleanup functions and jumps into the old location.
> 
> Another progress report. I now mostly understand what's happening.
> Contrary to the above, all the interesting stuff happens inside the
> parent process.
> 
> Cc'ing the apache2 maintainers; any ideas? See below.
> (The jump to an invalid address is crashing armel buildds so it's a
>  rather big problem ATM.  See #711167, where this has diverged.)
> 
> First, apache2 main() calls read_config() (from main.c:624), which
> loads all the modules. Loading mod_perl installs the pre_cleanup
> hook cleanup_perl_global_providers() as above.
> 
> Then, there's a loop starting at main.c:704 that has this comment:
> 
>         /* This is a hack until we finish the code so that it only
>          * reads the config file once and just operates on the tree
>          * already in memory.  rbb
>          */
> 
> and calls apr_pool_clear(pconf), which unloads the modules and
> should do all the cleanup AIUI. A bit later, at main.c:724,
> ap_read_config() is called again, and under some conditions (when
> stack limit is 'unlimited' and the number of modules is
> suitable?), mod_perl gets loaded at a different place than the
> first time. However, the earlier installed pre_cleanup hook is
> still in place, so we jump into an out-of-bounds location (where
> cleanup_perl_global_providers() used to reside) in the end when
> the cleanups are actually called.

The problem is that MP_CMD_SRV_DECLARE2(authz_provider) and 
MP_CMD_SRV_DECLARE2(authn_provider) register the cleanup against 
parms->server->process->pool, which outlives the pconf pool and 
therefore also outlives the period during which the mod_perl shared 
object is loaded. They should probably use parms->pool (which is 
pconf) instead.
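
To make the lifetime mismatch concrete, here is a toy model in plain C. 
It does not use APR or mod_perl at all: toy_pool, toy_module, 
registration and simulate_startup() are made-up names, and the 
"generation" counter merely stands in for the address at which the .so 
happens to be mapped. It assumes the startup order described above 
(load, clear pconf, load again) and that the process pool's 
pre-cleanups run at shutdown while the second copy is still loaded.

```c
#include <stddef.h>

#define MAX_CLEANUPS 8

typedef void (*cleanup_fn)(void *data);

/* Toy stand-in for a pool: just a stack of cleanup callbacks. */
typedef struct {
    cleanup_fn fns[MAX_CLEANUPS];
    void *data[MAX_CLEANUPS];
    int count;
} toy_pool;

/* Toy stand-in for the shared object. */
typedef struct {
    int loaded;      /* is the module's code currently mapped? */
    int generation;  /* bumped on every load; models a new load address */
    int stale_calls; /* cleanups that would have jumped into unmapped code */
} toy_module;

/* Remembers which load of the module registered the cleanup. */
typedef struct {
    toy_module *mod;
    int registered_generation;
} registration;

static void toy_pool_register(toy_pool *p, void *data, cleanup_fn fn)
{
    if (p->count < MAX_CLEANUPS) {
        p->fns[p->count] = fn;
        p->data[p->count] = data;
        p->count++;
    }
}

/* Run the cleanups newest-first, then forget them. */
static void toy_pool_clear(toy_pool *p)
{
    while (p->count > 0) {
        p->count--;
        p->fns[p->count](p->data[p->count]);
    }
}

/* In real life a stale call is a SIGSEGV; here it is only counted. */
static void module_cleanup(void *data)
{
    registration *r = data;
    if (!r->mod->loaded || r->registered_generation != r->mod->generation)
        r->mod->stale_calls++;
}

/* Returns how many cleanups would have jumped to a stale address. */
int simulate_startup(int use_pconf)
{
    toy_pool process_pool = {0};
    toy_pool pconf = {0};
    toy_module mod = {0};
    registration regs[2];
    int pass;

    for (pass = 0; pass < 2; pass++) {
        /* Config read: the module is loaded and registers its cleanup. */
        mod.loaded = 1;
        mod.generation++;
        regs[pass].mod = &mod;
        regs[pass].registered_generation = mod.generation;
        toy_pool_register(use_pconf ? &pconf : &process_pool,
                          &regs[pass], module_cleanup);

        if (pass == 0) {
            /* Between the two reads: pconf is cleared, module unloaded. */
            toy_pool_clear(&pconf);
            mod.loaded = 0;
        }
    }

    /* Shutdown: process pool cleanups, then pconf, then the unload. */
    toy_pool_clear(&process_pool);
    toy_pool_clear(&pconf);
    mod.loaded = 0;
    return mod.stale_calls;
}
```

In this model simulate_startup(0) (cleanup on the process pool) reports 
one stale call, the one left over from the first load, while 
simulate_startup(1) (cleanup on pconf) reports none.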

In general, everything mod_perl does should be undone by the 
clearing/destruction of pconf, because the .so will be unloaded 
after that. server->process->pool can be used to store things that 
need to be preserved beyond the unloading/loading of the .so; however, 
there is now also a higher-level API for that 
(ap_retained_data_create). Registering a cleanup with 
server->process->pool from a module is always a bad idea, because the 
module's code may move when the .so is reloaded.

Now, if there is a good reason that the above functions use 
server->process->pool, we need to figure out a way to fix that. But 
the original commit of that code has no comment with respect to the 
pool requirement. Therefore I think it may simply be a bug, and you 
should first test it with a cleanup registered against pconf.
 

> So I suppose mod_perl should somehow register a "module uninstall
> hook" that calls apr_pool_cleanup_run(...,
> cleanup_perl_global_providers, ...) [or apr_pool_cleanup_kill(),
> not sure] to remove the to-be-unloaded pre_cleanup hook. I haven't
> found a way to do that yet.

If you register a pool cleanup with pconf, it will be called before 
the .so is unloaded.
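
As a rough illustration of why the ordering works out, here is another 
toy model in plain C (again no APR; cleanup_stack, stack_register(), 
stack_clear() and pconf_clear_order() are made-up names). It assumes 
that the unloading of the .so is itself a pconf cleanup registered at 
load time, and that cleanups run newest-first, so anything the module 
registers afterwards runs before the unload.

```c
#include <string.h>

#define MAX_CLEANUPS 8
#define LOG_SIZE 64

typedef void (*cleanup_fn)(char *log);

/* Toy stand-in for pconf: a stack of cleanup callbacks. */
typedef struct {
    cleanup_fn fns[MAX_CLEANUPS];
    int count;
} cleanup_stack;

static void stack_register(cleanup_stack *s, cleanup_fn fn)
{
    if (s->count < MAX_CLEANUPS)
        s->fns[s->count++] = fn;
}

/* Run the cleanups in reverse order of registration. */
static void stack_clear(cleanup_stack *s, char *log)
{
    while (s->count > 0)
        s->fns[--s->count](log);
}

/* Each cleanup just appends its name to the log. */
static void unload_so(char *log)
{
    strcat(log, "unload;");
}

static void module_cleanup(char *log)
{
    strcat(log, "module_cleanup;");
}

/* Returns the order in which the two cleanups ran. */
const char *pconf_clear_order(void)
{
    static char log[LOG_SIZE];
    cleanup_stack pconf = {0};

    log[0] = '\0';
    stack_register(&pconf, unload_so);      /* registered when the .so loads */
    stack_register(&pconf, module_cleanup); /* registered later by the module */
    stack_clear(&pconf, log);
    return log;
}
```

Here pconf_clear_order() returns "module_cleanup;unload;", i.e. the 
module's own cleanup has finished before its code goes away.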

Cheers,
Stefan

