
Re: Bug#184495: libc6.postinst needs to restart postgres



At Thu, 13 Mar 2003 17:48:33 +0900,
GOTO Masanori wrote:
> At Wed, 12 Mar 2003 22:43:37 -0500,
> Anthony DeRobertis wrote:
> > PS: I plan to try fiddling the soname on the libnss modules; let's hope
> > we can say "Goodbye!" to all this silly restarting business.
> 
> Thanks, please check it!

Hm, from my investigation, the key is that libnss_*.so.2 is loaded by
libc.so.6 using dlopen().  My guess at what happens is:

  (1) If libnss_*.so is already dlopen()-ed, and it has not been dlclose()-ed,
	-> the original libnss_*-2.2.5.so is used (its removal is pending)

  (2) If libnss_*.so was already dlopen()-ed, but it has been dlclose()-ed,
	-> the new libnss_*-2.3.1.so is used on the next dlopen().

  (3) If libnss_*.so has not been dlopen()-ed yet,
	-> the new libnss_*-2.3.1.so is used on the first dlopen().

(1) is ok, but (2) and (3) are the problem cases.
The figures below walk through what happens in cases (2) and (3):

(1) This is the state before the upgrade, with glibc-2.2.5 installed.
    The L-shaped edge represents the internal interface between the
    two files; it matches in this case.

  |-----------------------------------|
  |libnss_*.so.2 -> libnss_*-2.2.5.so |
  |--|                                |
  |  |  |-----------------------------|  
  |  |--|                             | 
  |libc.so.6 -> libc-2.2.5.so         |
  |-----------------------------------|

(2) A program is loaded and the old libc.so.6 is mapped into memory.

  |------- program image in the memory ------|
  |                                          |
  |  |--|                                    |
  |  |  |  |-----------------------------|   |  
  |  |  |--|                             |   | 
  |  |libc.so.6 -> libc-2.2.5.so         |   |
  |  |-----------------------------------|   |
  |                                          |
  |------------------------------------------|

(3) Now libc6 is upgraded and the files are replaced.  The old
    libc.so.6 inode still exists because its removal is pending (it
    is still mapped), but it is no longer visible to the user.  The
    old libnss_*.so.2 inode, however, is completely replaced because
    it has not been dlopen()-ed.  The internal interface has changed
    because this is a major version upgrade.  But the external
    behavior and definitions have not changed, so the .2 soname is
    unchanged.

  |-----------------------------------|
  |libnss_*.so.2 -> libnss_*-2.3.1.so |
  |           |- |                    |
  |-----------|  |  |-----------------|  
  |              |--|                 | 
  |libc.so.6 -> libc-2.3.1.so         |
  |-----------------------------------|

(4) The program loads libnss_*.so.2 using dlopen().
    The program has now mapped this .so, but its internal interface
    is the new, changed one.

  |-----------------------------------|
  |libnss_*.so.2 -> libnss_*-2.3.1.so |
  |           |- |                    |
  |-----------|  |  |-----------------|  
                 |--|                 
           ^
           |
           | dlopen()
           |

  |------- program image in the memory ------|
  |                                          |
  |  |--|                                    |
  |  |  |  |-----------------------------|   |  
  |  |  |--|                             |   | 
  |  |libc.so.6 -> libc-2.2.5.so         |   |
  |  |-----------------------------------|   |
  |                                          |
  |------------------------------------------|

(5) The interfaces do not match, so the result is undefined.
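
The "remove pending" behaviour in steps (3) and (4) can be reproduced
with ordinary files.  The following Python sketch is only an
illustration, not what dpkg or ld.so actually run.  It assumes a POSIX
filesystem, and the function name replace_while_open, the file name
libexample.so.2 and the ".new" suffix are all made up for the example.
An already-open descriptor stands in for a mapped library, and a fresh
open() stands in for a later dlopen() by name.

```python
import os
import tempfile


def replace_while_open(old_text="glibc-2.2.5", new_text="glibc-2.3.1"):
    """Return (text seen through the old descriptor, text seen by a fresh open)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "libexample.so.2")  # hypothetical file name
        with open(path, "w") as f:
            f.write(old_text)

        # Stands in for a library that is already mapped by a running program.
        held = open(path)

        # Write the new version next to the old one, then rename it over the
        # old name.  The old inode loses its name but stays alive while open.
        tmp = path + ".new"  # hypothetical suffix
        with open(tmp, "w") as f:
            f.write(new_text)
        os.replace(tmp, path)

        # Stands in for a later dlopen() by name: it finds the new file.
        with open(path) as fresh:
            seen_fresh = fresh.read()

        seen_held = held.read()
        held.close()
        return seen_held, seen_fresh
```

Calling replace_while_open() should return the old text for the held
descriptor and the new text for the fresh open.  That is the same split
as the old libc.so.6 in memory and the new libnss_*.so.2 on disk.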


You may ask why some daemons that are invoked via inetd after the
upgrade (rsh and so on) also hit this problem.  The reason is simple.
Type pstree.  They are fork()-ed from inetd, so the same old
glibc-2.2.5 is used.


Ulrich argued in various documents[1]:

  There are limitations, though.  Different objects might use
  different versions of the interfaces which are incompatible.  This
  can create problems if references to objects are passed between
  these components of the application. It is not easy or not possible
  to deal with these kind of situations automatically. These
  situations occur almost never and most can be resolved by relinking
  the application or DSO dependency in question.

(He said this about internal versioning, but as you can see it also
applies to the Debian case.)


My conclusion is "we can never fix this completely".  The only thing
we can do is "warn users and recommend restarting services or
rebooting".  If my guess is correct, then bumping up the libnss.so
version is meaningless.

In addition, you may have noticed the following fact: DEBIAN HAS NEVER
GUARANTEED A SAFE UPGRADE WHILE APPLICATIONS THAT USE dlopen() ARE
RUNNING.  Discard your dream unless you improve the current dlopen()
or think of an alternative (yes, it's a challenge, and our
homework :-)

Moreover, to guarantee safety after upgrading libc6, we need to
restart every process that will use libnss.
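
One way to find the processes that still need a restart, sketched in
Python under the assumption of a Linux /proc filesystem: a mapping
whose file has been replaced is listed in /proc/<pid>/maps with a
trailing " (deleted)".  The helper names deleted_libraries and
processes_needing_restart are hypothetical, and this is not what
libc6.postinst does.

```python
import os
import re

# Pathname column of a /proc/<pid>/maps line for a shared object whose
# file has been unlinked or replaced; the kernel appends " (deleted)".
_DELETED_LIB = re.compile(r"\s(/\S*\.so\S*) \(deleted\)$")


def deleted_libraries(maps_text):
    """Return the sorted shared-object paths marked '(deleted)' in maps_text."""
    found = set()
    for line in maps_text.splitlines():
        m = _DELETED_LIB.search(line)
        if m:
            found.add(m.group(1))
    return sorted(found)


def processes_needing_restart(proc_root="/proc"):
    """Map pid -> deleted libraries it still maps; skips unreadable processes."""
    result = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return result  # no /proc here (not Linux, or not mounted)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as f:
                libs = deleted_libraries(f.read())
        except OSError:
            continue  # process exited, or we lack permission
        if libs:
            result[int(entry)] = libs
    return result
```

A process that still maps a deleted libc is exactly the kind that may
later dlopen() the new libnss.  Run as root, processes_needing_restart()
gives a candidate list to show in the warning; without root it only
sees the caller's own processes.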


I previously suggested a way to avoid this problem: bumping up the
libnss version.  As you now know, that cannot fix it (I was wrong).
Someone may suggest bumping up the libnss version and making the
glibc-2.3 deb depend on libnss-2.3 and libnss-2.2 packages that are
separated from glibc.  However, I object to this kind of fix.
First, bumping up the version should always be avoided unless we
really change the behavior of the external interface.  Second, this
upgrade path occurs only once: from 2.2 to 2.3.  I never want to
separate the library for such a one-time reason.  Finally, dlopen is
a broken concept for the current debian upgrade system (or debian's
concept is broken), and we can't guarantee a completely safe upgrade
of libc6 every time, as I mentioned.


As I said, my only fix is "warn users".  All of these bugs will be
tagged "wontfix" and closed.  The sarge upgrade guide or release
notes will recommend a reboot for innocent users.

If you find a mistake in this, please point it out.



[1] Good Practices in Library Design, Implementation, and Maintenance,
    Ulrich Drepper, 2002-03-07, http://people.redhat.com/drepper/.

Regards,
-- gotom


