
Symbols with the same name from different shared objects



Hi mentors,

I'm confused about which symbol is eventually used when
different shared objects provide different implementations of
the same function, e.g. glibc's malloc and jemalloc's malloc.

Debian's jemalloc package doesn't mangle the function names, i.e.
jemalloc's malloc implementation is exported as "malloc" instead
of "jemalloc_malloc" or something similar. The advantage of not mangling
the symbol names is that we can preload the shared object via the
dynamic linker (ld.so)

 $ LD_PRELOAD=jemalloc.so my_program

to take advantage of jemalloc without recompiling my_program.

~ ❯❯❯ readelf -sW /usr/lib/x86_64-linux-gnu/libjemalloc.so.2 | grep malloc
    76: 000000000000ca40 10040 FUNC    GLOBAL DEFAULT   12 malloc
    78: 00000000000151b0 12635 FUNC    GLOBAL DEFAULT   12 mallocx
    88: 000000000001d2e0   568 FUNC    GLOBAL DEFAULT   12 malloc_usable_size
    89: 00000000004bb5a8     8 OBJECT  GLOBAL DEFAULT   26 malloc_message
    93: 000000000001d130    97 FUNC    GLOBAL DEFAULT   12 malloc_stats_print
    94: 000000000029c2a0     8 OBJECT  GLOBAL DEFAULT   25 __malloc_hook
   111: 000000000029c4b0     8 OBJECT  WEAK   DEFAULT   26 malloc_conf

However, it also confuses me:

~/p/t/tensorflow ❯❯❯ readelf -d libtensorflow_framework.so

Dynamic section at offset 0x9d9bd8 contains 45 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libfarmhash.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libhighwayhash.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libsnappy.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libgif.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libdouble-conversion.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libprotobuf.so.17]
 0x0000000000000001 (NEEDED)             Shared library: [libjpeg.so.62]
 0x0000000000000001 (NEEDED)             Shared library: [libnsync.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libnsync_cpp.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libjemalloc.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000e (SONAME)             Library soname: [libtensorflow_framework.so.1.10]

It shows that this shared object is linked against both
jemalloc and libc6. Both of them provide the symbol "malloc".
The question is, how can I make sure TensorFlow will always load
and use jemalloc's malloc implementation instead of libc's?

I'm thinking about compiling a test program and tracing
the library calls with ltrace to find out which malloc implementation
is actually used, but I'm not sure the result will be the same on
different machines and in different environments. Is there
any other solution apart from embedding a copy of the jemalloc source
code in the tensorflow source package?
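
As a rough alternative to ltrace, the sketch below asks the dynamic linker itself which shared object a symbol gets bound to. It assumes a glibc system, where ld.so honours LD_DEBUG=bindings and LD_BIND_NOW; the trace format is not a stable interface, and the helper names binding_targets and show_bindings are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes glibc's dynamic linker, whose LD_DEBUG=bindings
# trace prints lines such as
#   binding file A [0] to B [0]: normal symbol `malloc'
# binding_targets and show_bindings are hypothetical helper names.

# Read an LD_DEBUG=bindings trace on stdin and print "user -> provider"
# for every binding of the symbol named in $1.
binding_targets() {
    awk -v pat="symbol \`$1'" \
        '$2 == "binding" && index($0, pat) { print $4 " -> " $7 }' | sort -u
}

# Run a program with eager binding (LD_BIND_NOW) so that every symbol is
# resolved at startup, then summarize where the symbol $1 was bound.
# The trace goes to stderr; the program's own stdout is discarded.
show_bindings() {
    sym=$1; shift
    LD_DEBUG=bindings LD_BIND_NOW=1 "$@" 2>&1 >/dev/null | binding_targets "$sym"
}

# Example (my_program is the placeholder name used earlier):
#   show_bindings malloc ./my_program
```

If the provider printed for libtensorflow_framework.so is libjemalloc.so.2, then jemalloc's malloc is the one in use for that run. The result only describes that particular machine and environment, so it does not by itself answer the portability concern.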

Put briefly, an equivalent problem would be: how can I make
sure that all file synchronization calls become no-ops when my
ELF binary is linked against both libeatmydata1 and libc6?

Thanks in advance.

