Re: Strange behavior with glibc 2.3.1, malloc and threads in Sid
On Friday 24 January 2003 01:10, Marek Habersack wrote:
> Hello all,
>
> I might be doing something wrong, but I'm getting a really strange
> behavior in a program I'm writing. Here's the scenario: I've got a daemon
> program which creates one thread. The main daemon process is listening on a
> unix socket, getting data from the client, creating a request (by
> malloc'ing storage for a structure) and putting it in the circular buffer.
> The circular buffer and all variables related to it are protected by a
> mutex. The thread created at the startup blocks waiting for a condition to
> be signalled which is done by the main program right after it puts a new
> request in the queue. When that happens, the thread wakes up, gets the
> request from the queue and unlocks the mutex that protects it. So far, so
> good - but the problem is that the data gotten from the queue in the thread
> is partially corrupted: the circular buffer address is correct, the address
> of the malloc'ed request structure is correct, but the data stored in the
> structure is apparently random. The interesting thing is that the actual
> data of the request stored in the buffer is _not_ corrupted. Here's a
> fragment of a debug log from the daemon:
>
> [prog] Inserting req == 0x804f258; nto == 2; to == 0x804f300
> [prog] Putting into queue 0x804d0a0 slot 0
> [prog] Inserted req == 0x804f258; nto == 2; to == 0x804f300
>
> at this point the condition is broadcast and the thread wakes up:
>
> [thread] Checking req == 0x804f258; nto == 2; to == (nil)
> note that the address of the request is correct but the data is
> corrupted.
>
> [thread] Getting from queue 0x804d0a0 slot 0
> queue address is also correct
>
> [thread] Retrieved req == 0x804f258; nto == 2; to == 0x30613064
> random data appears in the req storage
>
> [thread] Got req == 0x804f258; nto == 1869881403; to == 0x203b3835
> even more randomness
>
> the above output from the thread is done with the lock held, there is no
> way the data could be modified in the meantime. Now we're back in the main
> program after pthread_cond_broadcast returns and the mutex is unlocked:
>
> [prog] Checking(2) req == 0x804f258; nto == 2; to == 0x804f300
> And the data is correct again.
>
> The code is not complex and I'm positive that there is no race condition
> when accessing the data. The variables and routines that operate on the
> queue (with the mutex lock held) are as follows:
>
> ----- CUT -----
> static pthread_mutex_t reqq_mutex = PTHREAD_MUTEX_INITIALIZER;
> static pthread_cond_t reqq_cond = PTHREAD_COND_INITIALIZER;
> static unsigned long reqq_size = 0;
> static unsigned long reqq_write_idx = 0;
> static unsigned long reqq_read_idx = 0;
> static vda_request **reqq = NULL;
>
> inline static int rq_empty()
> {
> return reqq_write_idx == reqq_read_idx;
> }
>
> inline static int rq_full()
> {
> return ((reqq_write_idx + 1) % reqq_size) == reqq_read_idx;
> }
>
> vda_request *rq_get()
> {
> vda_request *ret;
>
> if (rq_empty())
> return NULL;
>
> logmsg("Getting from queue %p slot %lu", reqq, reqq_read_idx);
> ret = reqq[reqq_read_idx++];
> reqq_read_idx %= reqq_size;
> logmsg("Retrieved req == %p; nto == %d; to == %p", ret, ret->nto,
> ret->to);
>
> [...]
> logmsg("Putting into queue %p slot %lu", reqq, reqq_write_idx);
>
> reqq[reqq_write_idx++] = req;
> reqq_write_idx %= reqq_size;
> logmsg("Inserted req == %p; nto == %d; to == %p",
> reqq[reqq_read_idx], reqq[reqq_read_idx]->nto,
> reqq[reqq_read_idx]->to);
>
Not sure if that is the problem, but you are incrementing the index first, and
thus the second logmsg() call is logging the contents of a now-free slot in
your queue. : )
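For what it's worth, here is how I would restructure rq_get() so the log can
never touch a recycled slot: take the pointer out, clear the slot, advance the
index, and log only through the saved pointer. This is just a sketch -- the
vda_request layout and REQQ_SIZE below are made up, not your actual
definitions:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the poster's type and queue size. */
typedef struct { int nto; void *to; } vda_request;
#define REQQ_SIZE 8

static vda_request *reqq[REQQ_SIZE];
static unsigned long reqq_read_idx = 0, reqq_write_idx = 0;

/* Caller must hold reqq_mutex.  Save the pointer, clear the slot,
 * advance the index, and log only through the saved pointer --
 * never through the (now free) queue slot. */
static vda_request *rq_get(void)
{
    vda_request *ret;

    if (reqq_read_idx == reqq_write_idx)    /* rq_empty() */
        return NULL;

    ret = reqq[reqq_read_idx];
    reqq[reqq_read_idx] = NULL;             /* makes slot-reuse bugs obvious */
    reqq_read_idx = (reqq_read_idx + 1) % REQQ_SIZE;
    printf("Retrieved req == %p; nto == %d; to == %p\n",
           (void *)ret, ret->nto, ret->to);
    return ret;
}
```

Clearing the slot on removal costs one store and turns any later access
through the queue into an immediately visible NULL dereference instead of
silently reading stale data.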
> Is my understanding that the
> malloc'ed memory area can be shared between threads in a process correct?
Yes, threads share a common address space (at least they do here, and on every
system I have come across); the producer-consumer design is typical of
threaded apps and relies on exactly that.
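To illustrate, here is a bare-bones producer-consumer sketch in the same
spirit as your daemon: one mutex, one condition variable, and a malloc'd
request handed from one thread to another. The struct and all names are
invented for the example, not your API:

```c
#include <pthread.h>
#include <stdlib.h>

/* Toy request record (made-up field, not the poster's vda_request). */
struct req { int nto; };

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static struct req *slot = NULL;           /* one-element "queue" */

static void *consumer(void *arg)
{
    struct req *r;

    (void)arg;
    pthread_mutex_lock(&mtx);
    while (slot == NULL)                  /* guards against spurious wakeups */
        pthread_cond_wait(&cond, &mtx);
    r = slot;
    slot = NULL;
    pthread_mutex_unlock(&mtx);
    return r;                             /* hand the request back */
}

static struct req *produce(int nto)
{
    struct req *r = malloc(sizeof *r);    /* malloc'd in one thread... */
    r->nto = nto;
    pthread_mutex_lock(&mtx);
    slot = r;
    pthread_cond_signal(&cond);           /* ...picked up in another */
    pthread_mutex_unlock(&mtx);
    return r;
}
```

Note the while loop around pthread_cond_wait(): POSIX allows spurious
wakeups, so the predicate must always be re-checked under the mutex.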