[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#922416: cache.commit() doesn't release the archives lock



On Fri, Feb 15, 2019 at 8:48 PM Julian Andres Klode <jak@debian.org> wrote:
> From my understanding of the C++ code, the lock is released when the object
> is garbage collected [2] and calling shutdown (which is what the commit()
> code does) is not enough.

This is correct. But the fetcher should be collected at the end of the
SystemLock block, as it goes out of scope and thus ref count drops to 0, so
it's odd that it does not work for you.

The thing is, there are two different garbage collectors at play here, the python one and the C++ one, and it's kinda hard to reason about what should and shouldn't happen given the different policies for each of them (the python GC is very reluctant to collect things in general).

I've been able to reduce my problem to a very small script that reproduces the issue using python-apt 1.8.3 while it works fine with the old version:

==============================================
#!/usr/bin/env python
import apt
import subprocess

cache = apt.Cache()
pkg = cache["hello"]
pkg.mark_install()
cache.commit()
subprocess.check_call(["apt", "remove", "--yes", "hello"])
==============================================

Executing this with 1.8.3 I get:
------------------------------------------------------------
marga@balerion:~$ sudo python example.py 
Selecting previously unselected package hello.
(Reading database ... 392245 files and directories currently installed.)
Preparing to unpack .../hello_2.10-1+deb9u1_amd64.deb ...
Unpacking hello (2.10-1+deb9u1) ...
Processing triggers for install-info (6.5.0.dfsg.1-1) ...
Setting up hello (2.10-1+deb9u1) ...
Processing triggers for man-db (2.7.6.1-4) ...
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Could not get lock /usr/local/google/var/cache/apt/archives/lock - open (11: Resource temporarily unavailable)
E: Unable to lock directory /usr/local/google/var/cache/apt/archives/
Traceback (most recent call last):
  File "example.py", line 9, in <module>
    subprocess.check_call(["apt", "remove", "--yes", "hello"])
  File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['apt', 'remove', '--yes', 'hello']' returned non-zero exit status 100
------------------------------------------------------------

Executing it with 1.4.0~beta3 I get:
------------------------------------------------------------
marga@balerion:~$ sudo python example.py 
Selecting previously unselected package hello.
(Reading database ... 392243 files and directories currently installed.)
Preparing to unpack .../hello_2.10-1+deb9u1_amd64.deb ...
Unpacking hello (2.10-1+deb9u1) ...
Processing triggers for install-info (6.5.0.dfsg.1-1) ...
Setting up hello (2.10-1+deb9u1) ...
Processing triggers for man-db (2.7.6.1-4) ...
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be REMOVED:
  hello
0 upgraded, 0 newly installed, 1 to remove and 2 not upgraded.
After this operation, 279 kB disk space will be freed.
(Reading database ... 392289 files and directories currently installed.)
Removing hello (2.10-1+deb9u1) ...
Processing triggers for install-info (6.5.0.dfsg.1-1) ...
Processing triggers for man-db (2.7.6.1-4) ...
------------------------------------------------------------

Of course, my actual code is much more complex than this code, and there's a lot more scope difference (like, the cache.commit() code is in one function and the subprocess call is in a totally different function, that is not calling apt directly, but rather calling puppet that then calls apt), but the result is the same. cache.commit() should release the lock when it's done and it's not doing that.

So, this was a bug fix:

https://salsa.debian.org/apt-team/python-apt/commit/618a8e47e6ec74b21ab952da6e85ae327f87cbf7>

There might be a solution with explicit locking that also takes care of
keeping the lock open as long as needed. I guess we could do explicit locking
in both fetch_archives and commit().

I'm not sure what that bug was fixing, but it seems wrong to make the scope of the C++ object be the one that handles the release of the lock. Unlocking should be done explicitly as soon as the lock isn't needed anymore.

One option that would avoid having to undo the whole change would be to add the release of the lock to the shutdown function in the Acquire object.

But personally I think the lock should be both acquired and released explicitly by the python code rather than calling a C++ function to do it.

Thanks.

--
Cheers,
Marga

Reply to: