Bug#473812: libc6: calloc returns non-zero memory areas when mlockall is being used
Package: libc6
Version: 2.7-5
Severity: normal
Hi!
The bug I found (if it is a bug) is very hard to reproduce for me, so
bear with me if the explanation is a bit sketchy (a glibc-malloc expert
would need to look at this in more detail). Please also note that I have
stitched together the examples from multiple debugging runs, so the
addresses do not necessarily match.
Findings of fact:
1. calloc returns memory areas that contain data from previous allocations
(typical example:
0x2aaab01c6fc0: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c6fc8: -56 'È' -20 'ì' 26 '\032' 5 '\005' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c6fd0: 13 '\r' 0 '\0' 0 '\0' 0 '\0' 4 '\004' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c6fd8: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c6fe0: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c6fe8: -80 '°' -82 '®' -81 '¯' 2 '\002' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c6ff0: -16 'ð' 108 'l' 28 '\034' -80 '°' -86 'ª' 42 '*' 0 '\0' 0 '\0'
0x2aaab01c6ff8: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c7000: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c7008: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7010: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7018: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7020: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7028: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7030: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7038: -48 'Ð' -22 'ê' 126 '~' 2 '\002' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
0x2aaab01c7040: 112 'p' 90 'Z' 28 '\034' -80 '°' -86 'ª' 42 '*' 0 '\0' 0 '\0'
0x2aaab01c7048: 28 '\034' 0 '\0' 0 '\0' 0 '\0' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7050: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7058: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7060: 64 '@' 0 '\0' 0 '\0' 0 '\0' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7068: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7070: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7078: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
0x2aaab01c7080: 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U' 85 'U'
The 0x55 bytes in there result from this code, executed earlier:
if (text) memset (SvPVX(text),0x55,SvLEN(text));//D
if (text) SvREFCNT_dec (text);
The second line causes the memory filled with 0x55 to be freed.
Note that the 0x55's start near a 4k boundary.
2. mallopt (M_PERTURB, <nonzero>) makes the program work
3. NOT using mlockall (MCL_CURRENT | MCL_FUTURE) makes the program work
4. using valgrind makes the program work
5. using dmalloc makes the program work
So this problem only happens with the glibc malloc, when mlockall is
active and the perturb-debugging-code is NOT active. I will show why these
conditions are necessary.
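The symptom in finding 1 can be caught close to where it happens by checking every calloc result for stray non-zero bytes. The following is only a minimal sketch of such a check; the helper names first_nonzero_offset and checked_calloc are made up for illustration and are not part of my test program or of glibc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: offset of the first non-zero byte in mem,
   or -1 if all len bytes are zero. */
static long first_nonzero_offset(const void *mem, size_t len)
{
    const unsigned char *p = mem;
    for (size_t i = 0; i < len; i++)
        if (p[i] != 0)
            return (long)i;
    return -1;
}

/* Hypothetical wrapper: allocate with calloc and verify the zero-fill
   guarantee. Returns NULL on allocation failure (bad_offset is then
   left untouched); otherwise *bad_offset is -1 for a clean block or
   the offset of the first stale byte. Assumes nmemb * size does not
   overflow. */
static void *checked_calloc(size_t nmemb, size_t size, long *bad_offset)
{
    void *mem = calloc(nmemb, size);
    if (mem != NULL)
        *bad_offset = first_nonzero_offset(mem, nmemb * size);
    return mem;
}
```

On an unaffected allocator checked_calloc always reports -1. In a program showing this bug, it would presumably report the offset of the first leftover 0x55 (or stale pointer) byte.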
How this likely happens:
From looking through the glibc source code, I can see that calloc
sometimes does not clear the memory block, or only clears part of it,
as an optimisation:
  /* Two optional cases in which clearing not necessary */
#if HAVE_MMAP
  if (chunk_is_mmapped (p))
    {
      if (__builtin_expect (perturb_byte, 0))
        MALLOC_ZERO (mem, sz);
      return mem;
    }
#endif
  csz = chunksize(p);
#if MORECORE_CLEARS
  if (perturb_byte == 0 && (p == oldtop && csz > oldtopsize)) {
    /* clear only the bytes from non-freshly-sbrked memory */
    csz = oldtopsize;
  }
#endif
The memory block above is not an mmapped chunk (the word before it
in memory is "0xb5", which means it's not from brk-managed memory, has
a size of 0xb0 bytes, has no valid previous-size prefix and is not an
mmapped chunk).
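For reference, this is how the size word 0xb5 decodes. The three flag names and values are the ones glibc's malloc.c uses for the low bits of a chunk's size field; the struct and the function decode_size_word are hypothetical names introduced only for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag bits kept in the low bits of a glibc malloc chunk's size word. */
#define PREV_INUSE     0x1  /* previous chunk is in use: no valid prev_size */
#define IS_MMAPPED     0x2  /* chunk was obtained directly with mmap */
#define NON_MAIN_ARENA 0x4  /* chunk belongs to an mmap-based arena, not brk */
#define SIZE_BITS      (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

/* Hypothetical container for the decoded fields. */
struct chunk_info {
    size_t size;
    bool prev_inuse;
    bool is_mmapped;
    bool non_main_arena;
};

/* Split a raw size word into the chunk size and its three flags. */
static struct chunk_info decode_size_word(size_t word)
{
    struct chunk_info ci;
    ci.size = word & ~(size_t)SIZE_BITS;
    ci.prev_inuse = (word & PREV_INUSE) != 0;
    ci.is_mmapped = (word & IS_MMAPPED) != 0;
    ci.non_main_arena = (word & NON_MAIN_ARENA) != 0;
    return ci;
}
```

Feeding it 0xb5 gives a size of 0xb0 with PREV_INUSE and NON_MAIN_ARENA set and IS_MMAPPED clear, which matches the reading above.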
However, the second part checks for the case where an allocation
has been extended, which happens when there was a call to sbrk,
extending the heap, or, for mmap-managed heaps, when there was a
call to mprotect. In both cases, calloc will only clear up to the
newly-allocated segment.
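To make the consequence of that shortcut concrete, here is a toy model of it. This is NOT glibc code: it only mirrors the quoted MORECORE_CLEARS test, all names are hypothetical, and it ignores header overhead. It assumes the chunk being returned was carved from the old top.

```c
#include <stddef.h>
#include <string.h>

/* Toy model, NOT glibc code: how many bytes of a chunk get cleared. */
static size_t bytes_cleared(size_t csz, size_t oldtopsize,
                            int chunk_was_top, int perturb_byte)
{
    if (perturb_byte == 0 && chunk_was_top && csz > oldtopsize)
        return oldtopsize;  /* trust the "fresh" tail to be zero already */
    return csz;             /* otherwise clear the whole chunk */
}

/* Clear a chunk the way the model says, then count the bytes that are
   still non-zero, i.e. stale data that would leak out of calloc. */
static size_t stale_bytes_after_clear(unsigned char *chunk, size_t csz,
                                      size_t oldtopsize, int perturb_byte)
{
    size_t stale = 0;
    memset(chunk, 0, bytes_cleared(csz, oldtopsize, 1, perturb_byte));
    for (size_t i = 0; i < csz; i++)
        if (chunk[i] != 0)
            stale++;
    return stale;
}
```

If the tail beyond oldtopsize really is fresh zero memory, nothing is lost. If it still holds old data (for example 0x55 bytes), everything past oldtopsize survives, unless perturb_byte is set, in which case the whole chunk is cleared. That is consistent with finding 2.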
This is apparently the condition that gets triggered, and here is how:
Again, from reading the sources, it seems that glibc has the ability
to manage multiple heap arenas, one with brk/sbrk, and multiple
ones with mmap(PROT_NONE) which get "physically allocated" with
mprotect(PROT_READ|PROT_WRITE) and "physically freed" with madvise
(MADV_DONTNEED).
In an strace (intermingled with debugging output), I see this:
mprotect(0x2aaab0135000, 155648, PROT_READ|PROT_WRITE) = 0
(a) 0x2aaab010ec00 [0x2aaab0134d60 0x2aaab015aeb6]
madvise(0x2aaab012f000, 180224, 0x4 /* MADV_??? */) = -1 EINVAL (Invalid argument)
(b) 0x2aaab013afc0 0x5555555555555555 (0 135)
Explanation:
The first mprotect "allocates" the memory used for the "text"
above (the piece of memory that later gets memset to 0x55).
The line (a) is debugging output from my program showing that
[0x2aaab0134d60..0x2aaab015aeb6] was allocated.
It is subsequently memset to 0x55 and then freed, resulting in the
madvise (from malloc/arena.c), where glibc tries to get rid of the
memory. The expectation from madvise is that the memory is cleared to
zero by the kernel. Note how the madvise call (0x4 == MADV_DONTNEED
btw.) fails, and also note that glibc completely ignores errors from
madvise (see malloc/arena.c).
In line (b) we see the address returned by calloc, and a pointer
inside the calloc'ed memory area, which should be 0, but isn't. This
is because glibc thinks madvise cleared the memory, and the calloc
optimisation kicks in where glibc assumes that the memory is now zero,
when in fact it isn't cleared at all.
EINVAL from madvise is documented as:
EINVAL The value len is negative, start is not page-aligned, advice
is not a valid value, or the application is attempting to release
locked or shared pages (with MADV_DONTNEED).
which explains why it fails only when mlockall is being used.
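One way to avoid relying on an unchecked madvise is to look at its return value and clear the memory by hand when the kernel refuses. The following is only a sketch of that idea, not a proposed glibc patch; discard_or_clear is a hypothetical name, and the zero-fill-on-success assumption holds only for private anonymous mappings such as the mmap-based heaps discussed here.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: make [addr, addr+len) read as zero.
   addr must be page-aligned and belong to a private anonymous mapping.
   Returns 0 if the kernel dropped the pages (they are zero-filled on
   the next access), or 1 if madvise failed (e.g. EINVAL on locked
   pages) and the range had to be cleared with memset instead. */
static int discard_or_clear(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_DONTNEED) == 0)
        return 0;
    memset(addr, 0, len);  /* fallback: pages stay resident but are zeroed */
    return 1;
}
```

With the fallback, the range reads as zero whether or not the pages are locked, so an assumption like the one calloc makes would hold in both cases. The cost is that locked pages are not returned to the kernel.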
Result:
mlockall is incompatible with the glibc memory allocator. This should
either be fixed or clearly documented (preferably fixed, as most
programs using mlockall are rather mission-critical, which is why they
use mlockall in the first place :)
Again, my test program is rather big, and I didn't instrument my glibc, so
the above could also be wrong, which is why a glibc expert needs to look
at it. In any case, I think the problem is relatively obvious, and not
checking the madvise return code was a bad thing in the first place.
(As a related note, I think this could also explain some of the memleaks
I experience where mallinfo shows much _less_ memory used than ps
(i.e. 400 MB vs. 1.5 GB), which isn't explainable by mere internal
fragmentation. This would fit into the above, as glibc might assume
the additional memory has been madvised into oblivion when the kernel
disagrees.)
-- System Information:
Debian Release: 4.0
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.23-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages libc6 depends on:
ii libgcc1 1:4.3.0-1 GCC support library
libc6 recommends no packages.
-- debconf information:
glibc/restart-services:
glibc/restart-failed: