Bug#814980: Additional information regarding bug #814980

To: "814980@bugs.debian.org" <814980@bugs.debian.org>
Subject: Bug#814980: Additional information regarding bug #814980
From: "Burke, Max" <mburke@ea.com>
Date: Thu, 25 Aug 2016 21:12:20 +0000
Message-id: <BL2PR07MB2404396400A7B15DD94F0466B7ED0@BL2PR07MB2404.namprd07.prod.outlook.com>
Reply-to: "Burke, Max" <mburke@ea.com>, 814980@bugs.debian.org

Hello,

I believe I am seeing this same bug (814980), but with Apache for Windows. I do not think this is necessarily a Windows-specific bug. Here is the information I was able to dig up. One of Apache's memory allocators (apr-util/misc/apr_rmm.c) is getting stuck in the while (next) loop below: at least in the case I am observing, blk->next holds the same value as next, so the loop never advances.

static apr_rmm_off_t find_block_of_size(apr_rmm_t *rmm, apr_size_t size)
{
    apr_rmm_off_t next = rmm->base->firstfree;
    apr_rmm_off_t best = 0;
    apr_rmm_off_t bestsize = 0;

    while (next) {
        struct rmm_block_t *blk = (rmm_block_t*)((char*)rmm->base + next);

        if (blk->size == size)
            return next;

        if (blk->size >= size) {
            /* XXX: sub optimal algorithm 
             * We need the most thorough best-fit logic, since we can
             * never grow our rmm, we are SOL when we hit the wall.
             */
            if (!bestsize || (blk->size < bestsize)) {
                bestsize = blk->size;
                best = next;
            }
        }

        next = blk->next;
    }

    if (bestsize > RMM_BLOCK_SIZE + size) {
        struct rmm_block_t *blk = (rmm_block_t*)((char*)rmm->base + best);
        struct rmm_block_t *new = (rmm_block_t*)((char*)rmm->base + best + size);

        new->size = blk->size - size;
        new->next = blk->next;
        new->prev = best;

        blk->size = size;
        blk->next = best + size;

        if (new->next) {
            blk = (rmm_block_t*)((char*)rmm->base + new->next);
            blk->prev = best + size;
        }
    }

    return best;
}

The debugger shows several threads inside this function at the same time, all operating on the same data.
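
The hang can be illustrated in isolation. The following is a minimal sketch, not the apr_rmm code: demo_block, demo_put and demo_walk are hypothetical names, and the block header is a simplified stand-in for rmm_block_t. It walks an offset-based free list the same way the loop above does, but stops after a fixed number of steps so that a block whose next offset points back at itself is reported instead of spinning forever.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for rmm_block_t. Offsets are relative
 * to the start of the segment; offset 0 terminates the free list. */
typedef struct demo_block {
    size_t size;
    size_t prev;
    size_t next;
} demo_block;

/* Store a block header at offset `off` (memcpy avoids alignment issues). */
void demo_put(unsigned char *base, size_t off,
              size_t size, size_t prev, size_t next)
{
    demo_block blk = { size, prev, next };
    memcpy(base + off, &blk, sizeof blk);
}

/* Walk the free list the way the while (next) loop does, but give up after
 * max_steps blocks. Returns the number of blocks visited, or -1 if the
 * bound was hit, which suggests a cycle such as blk->next == next. */
long demo_walk(const unsigned char *base, size_t first, long max_steps)
{
    size_t next = first;
    long steps = 0;

    while (next) {
        demo_block blk;

        if (steps >= max_steps)
            return -1;
        memcpy(&blk, base + next, sizeof blk);
        steps++;
        next = blk.next;
    }
    return steps;
}
```

With two well-formed blocks at offsets 32 and 64, demo_walk(seg, 32, 100) visits both and returns 2. Once the header at offset 64 is rewritten so that its next is also 64, the same call returns -1, whereas an unbounded loop would never exit.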

This function is, in theory, guarded by a lock. However, the lock is a tagged union that can hold any of several lock kinds (i.e., a cross-process mutex, a thread mutex, a read/write lock, or a null lock type), as declared in apr-util/include/apr_anylock.h:

/** Structure that may contain any APR lock type */
typedef struct apr_anylock_t {
    /** Indicates what type of lock is in lock */
    enum tm_lock {
        apr_anylock_none,           /**< None */
        apr_anylock_procmutex,      /**< Process-based */
        apr_anylock_threadmutex,    /**< Thread-based */
        apr_anylock_readlock,       /**< Read lock */
        apr_anylock_writelock       /**< Write lock */
    } type;
    /** Union of all possible APR locks */
    union apr_anylock_u_t {
        apr_proc_mutex_t *pm;       /**< Process mutex */
#if APR_HAS_THREADS
        apr_thread_mutex_t *tm;     /**< Thread mutex */
        apr_thread_rwlock_t *rw;    /**< Read-write lock */
#endif
    } lock;
} apr_anylock_t;
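
Why a null lock type provides no protection can be sketched without APR. In APR itself the dispatch on the type field is done by the APR_ANYLOCK_LOCK and APR_ANYLOCK_UNLOCK macros in the same header. The version below is only an illustration under stated assumptions: the demo_* names are hypothetical, and a C11 atomic_flag stands in for a real mutex so that the example needs no threading library.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical tag: either no lock at all, or a simple flag-based lock. */
typedef enum { DEMO_LOCK_NONE, DEMO_LOCK_FLAG } demo_lock_type;

/* Hypothetical stand-in for apr_anylock_t: a type tag plus a union. */
typedef struct demo_anylock {
    demo_lock_type type;
    union {
        atomic_flag *flag;   /* used only when type == DEMO_LOCK_FLAG */
    } lock;
} demo_anylock;

/* Try to take the lock without blocking.
 * Returns 0 on success and -1 if the lock is already held. */
int demo_anylock_trylock(demo_anylock *l)
{
    if (l->type == DEMO_LOCK_NONE)
        return 0;   /* null lock: always "succeeds" and excludes nobody */
    return atomic_flag_test_and_set(l->lock.flag) ? -1 : 0;
}

/* Release the lock; a no-op for the null type. */
void demo_anylock_unlock(demo_anylock *l)
{
    if (l->type == DEMO_LOCK_FLAG)
        atomic_flag_clear(l->lock.flag);
}
```

With a DEMO_LOCK_FLAG lock, a second demo_anylock_trylock() fails until the first holder unlocks. With DEMO_LOCK_NONE, every caller succeeds immediately, so any number of threads can be inside the guarded region at once.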

Looking at the lock object's innards in the debugger, the lock in use appears to be the null type. That makes sense, because the LDAP cache code doesn't pass in a lock (httpd/modules/ldap/util_ldap_cache.c):

        /* This will create a rmm "handler" to get into the shared memory area */
        result = apr_rmm_init(&st->cache_rmm, NULL,
                              apr_shm_baseaddr_get(st->cache_shm), size,
                              st->pool);

If no lock is passed in, apr_rmm_init() initializes it to a null lock:

APU_DECLARE(apr_status_t) apr_rmm_init(apr_rmm_t **rmm, apr_anylock_t *lock, 
                                       void *base, apr_size_t size,
                                       apr_pool_t *p)
{
    apr_status_t rv;
    rmm_block_t *blk;
    apr_anylock_t nulllock;
    
    if (!lock) {
        nulllock.type = apr_anylock_none;
        nulllock.lock.pm = NULL;
        lock = &nulllock;
    }

I would think the apr_rmm_init() call in the LDAP cache should pass in a lock, or the LDAP cache should avoid using the apr_rmm memory system.

-Max
