Symbols with the same name from different shared objects
Hi mentors,
I'm confused about which symbol ends up being used when
different shared objects provide different implementations of
the same function, e.g. malloc from glibc and malloc from jemalloc.
Debian's jemalloc package doesn't mangle the function names, i.e.
jemalloc's malloc implementation is exported as "malloc" instead
of "jemalloc_malloc" or something alike. The advantage of not mangling
the symbol names is that we can preload the shared object through the dynamic linker
$ LD_PRELOAD=jemalloc.so my_program
to take advantage of jemalloc without recompiling my_program.
~ ❯❯❯ readelf -sW /usr/lib/x86_64-linux-gnu/libjemalloc.so.2 | grep malloc
76: 000000000000ca40 10040 FUNC GLOBAL DEFAULT 12 malloc
78: 00000000000151b0 12635 FUNC GLOBAL DEFAULT 12 mallocx
88: 000000000001d2e0 568 FUNC GLOBAL DEFAULT 12 malloc_usable_size
89: 00000000004bb5a8 8 OBJECT GLOBAL DEFAULT 26 malloc_message
93: 000000000001d130 97 FUNC GLOBAL DEFAULT 12 malloc_stats_print
94: 000000000029c2a0 8 OBJECT GLOBAL DEFAULT 25 __malloc_hook
111: 000000000029c4b0 8 OBJECT WEAK DEFAULT 26 malloc_conf
However, it also confuses me:
~/p/t/tensorflow ❯❯❯ readelf -d libtensorflow_framework.so
Dynamic section at offset 0x9d9bd8 contains 45 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libfarmhash.so.0]
0x0000000000000001 (NEEDED) Shared library: [libhighwayhash.so.0]
0x0000000000000001 (NEEDED) Shared library: [libsnappy.so.1]
0x0000000000000001 (NEEDED) Shared library: [libgif.so.7]
0x0000000000000001 (NEEDED) Shared library: [libdouble-conversion.so.1]
0x0000000000000001 (NEEDED) Shared library: [libz.so.1]
0x0000000000000001 (NEEDED) Shared library: [libprotobuf.so.17]
0x0000000000000001 (NEEDED) Shared library: [libjpeg.so.62]
0x0000000000000001 (NEEDED) Shared library: [libnsync.so.1]
0x0000000000000001 (NEEDED) Shared library: [libnsync_cpp.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libjemalloc.so.2]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
0x000000000000000e (SONAME) Library soname: [libtensorflow_framework.so.1.10]
It shows that this shared object is linked against both
jemalloc and libc6. Both of them provide the symbol "malloc".
The question is, how can I make sure TensorFlow will always load
and use jemalloc's malloc implementation instead of libc's?
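To frame the question, here is a toy Python model of the lookup order that, as far as I understand, the glibc dynamic linker uses for the global scope: the executable first, then any LD_PRELOAD objects, then the DT_NEEDED entries breadth-first, with the first object that defines the symbol winning. The executable name my_program and its NEEDED list are hypothetical, the TensorFlow NEEDED list is abbreviated from the readelf output above, and the model deliberately ignores symbol versioning, dlopen() scopes, RTLD_DEEPBIND and -Bsymbolic.

```python
from collections import deque

# DT_NEEDED entries per object. "my_program" and its list are hypothetical;
# the TensorFlow list is abbreviated from the readelf output in this mail.
NEEDED = {
    "my_program": ["libtensorflow_framework.so.1.10", "libc.so.6"],
    "libtensorflow_framework.so.1.10": [
        "libpthread.so.0",
        "libjemalloc.so.2",
        "libstdc++.so.6",
        "libc.so.6",
    ],
}

# Only the symbol of interest is modelled here.
EXPORTS = {
    "libjemalloc.so.2": {"malloc"},
    "libc.so.6": {"malloc"},
}


def lookup_order(root, needed, preload=()):
    """Return objects in search order: root, preloads, then NEEDED breadth-first."""
    order = [root, *preload]
    seen = set(order)
    queue = deque(order)
    while queue:
        obj = queue.popleft()
        for dep in needed.get(obj, ()):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


def resolve(symbol, order, exports):
    """Return the first object in the search order that defines the symbol."""
    for obj in order:
        if symbol in exports.get(obj, ()):
            return obj
    return None


plain = lookup_order("my_program", NEEDED)
preloaded = lookup_order("my_program", NEEDED, preload=("libjemalloc.so.2",))
print("plain run:   ", resolve("malloc", plain, EXPORTS))
print("with preload:", resolve("malloc", preloaded, EXPORTS))
```

In this model the plain run resolves malloc to libc.so.6, because the hypothetical executable lists libc directly while libjemalloc is only reached one level deeper; with the preload, libjemalloc.so.2 comes first. If that is how the real linker behaves, the NEEDED order inside libtensorflow_framework.so alone would not decide the outcome.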
I'm thinking about compiling a test program and tracing
the library calls with ltrace to find out which malloc implementation
is actually used, but I'm not sure the result will be the same on
different machines and in different environments. Is there
any other solution apart from embedding a copy of the jemalloc source code
in the tensorflow source package?
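As an alternative to ltrace, a runtime check could ask the dynamic linker directly which object the bound malloc lives in. The sketch below does this from Python via ctypes and dladdr(). It assumes a Linux-like system where dladdr is reachable through the process's global scope; provider_of is a hypothetical helper name, and the result describes the Python process itself, not my_program.

```python
import ctypes


class DlInfo(ctypes.Structure):
    """Mirror of the Dl_info struct filled in by dladdr(3)."""
    _fields_ = [
        ("dli_fname", ctypes.c_char_p),  # path of the object containing the address
        ("dli_fbase", ctypes.c_void_p),  # load address of that object
        ("dli_sname", ctypes.c_char_p),  # nearest symbol name, if any
        ("dli_saddr", ctypes.c_void_p),  # address of that symbol
    ]


def provider_of(symbol):
    """Return the path of the shared object that the given symbol resolves to."""
    process = ctypes.CDLL(None)  # dlopen(NULL): search the process's global scope
    dladdr = process.dladdr
    dladdr.argtypes = [ctypes.c_void_p, ctypes.POINTER(DlInfo)]
    dladdr.restype = ctypes.c_int

    # Look the symbol up the way dlsym() would, then map its address to an object.
    address = ctypes.cast(getattr(process, symbol), ctypes.c_void_p)
    info = DlInfo()
    if dladdr(address, ctypes.byref(info)) == 0 or info.dli_fname is None:
        raise OSError("dladdr could not map %r to a shared object" % symbol)
    return info.dli_fname.decode()


if __name__ == "__main__":
    print("malloc is provided by:", provider_of("malloc"))
```

Running the script once normally and once with LD_PRELOAD set to the jemalloc library should show whether the reported path changes. The same dladdr() call can be made from a small C program linked the way the real binary is, which would be closer to the case I actually care about.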
Briefly speaking, an equivalent problem could be: How can I make
sure that all file synchronization calls would be nooped when my
ELF binary is linked against both libeatmydata1 and libc6?
Thanks in advance.