Bug#748922: python-apt: TagFile doesnt close file
Hi
Quoting Michael Vogt (2014-06-06 11:15:29)
> There is a small typo in the above script. gc.collect should be gc.collect().
right. I noticed this too but it was too late and I already sent in my
bugreport. See my other submission to it for my updated results.
> I verified that the following works and does not leak fds:
> """
> class LeakTestCase(unittest.TestCase):
> def test_leak(self):
> # clenaup gc first
> import gc
> gc.collect()
> # see what fds we have
> fds = os.listdir("/proc/self/fd")
> testfile = __file__
> tagf = apt_pkg.TagFile(testfile)
> tagf.step()
> del tagf
> import gc
> gc.collect()
> # ensure fd is closed
> self.assertEqual(fds, os.listdir("/proc/self/fd"))
> """
>
> Unfortunately just doing a "del tagf" is not enough, the gc call is
> needed afterwards. The reason that the del is not enough is that there
> is there is a cyclic reference from the tagf to tagf.section. The
> garbage collector breaks it, but a simple del sees a refcount >
> 0. This particular case could maybe fixed by copying the data from the
> pkgTagFile to a pkgTagSection instead of letting it operator on the
> Buffer of pkgTagFile. But that requires somework (plus additional
> memory for the copied data).
The problem is, as you also identified above, that as long as the Python object
for apt_pkg.TagFile is around, the file stays open.
I switched from apt_pkg to debian.deb822 because in my use case I want to read
from file A, modify the data and want to write back to A again. For that I
obviously should not have the original fd from reading A open when I write back
to A. One possible workaround would be to copy all of A into a StringIO and
then pass that to apt_pkg.TagFile. This would probably work but I think the
expectation is that after doing:
mypkgs = list(apt_pkg.TagFile("Packages"))
or:
mypkgs = []
for pkg in apt_pkg.TagFile("Packages"):
mypkgs.append(pkgs)
that there are no files left open. Currently in both cases, the fd is still
around. So after the last pkgTagSection is retrieved, the file should be
closed.
Maybe I find some time at some point to implement this but unfortunately the
debian.deb822 module seems to work quite well (even though it's orders of
magnitude slower).
cheers, josch
Reply to: