
Bug#748922: python-apt: TagFile doesnt close file



Hi

Quoting Michael Vogt (2014-06-06 11:15:29)
> There is a small typo in the above script. gc.collect should be gc.collect().

Right. I noticed this too, but only after I had already sent in my bug
report. See my other submission to this bug for my updated results.

> I verified that the following works and does not leak fds:
> """
> class LeakTestCase(unittest.TestCase):
>     def test_leak(self):
>         # clean up gc first
>         import gc
>         gc.collect()
>         # see what fds we have
>         fds = os.listdir("/proc/self/fd")
>         testfile = __file__
>         tagf = apt_pkg.TagFile(testfile)
>         tagf.step()
>         del tagf
>         import gc
>         gc.collect()
>         # ensure fd is closed
>         self.assertEqual(fds, os.listdir("/proc/self/fd"))
> """
> 
> Unfortunately just doing a "del tagf" is not enough; the gc call is
> needed afterwards. The reason the del is not enough is that there is
> a cyclic reference from the tagf to tagf.section. The garbage
> collector breaks it, but a simple del still sees a refcount greater
> than 0. This particular case could maybe be fixed by copying the data
> from the pkgTagFile to a pkgTagSection instead of letting it operate
> on the Buffer of pkgTagFile. But that requires some work (plus
> additional memory for the copied data).

The problem is, as you also identified above, that as long as the Python object
for apt_pkg.TagFile is around, the file stays open.
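
The refcount effect can be reproduced without python-apt. The following is
only a minimal sketch in plain Python: the classes Owner and Section are
hypothetical stand-ins for TagFile and its section, not the real apt_pkg
objects, and it assumes CPython's reference counting.

```python
import gc
import weakref


class Section:
    """Hypothetical stand-in for tagf.section: points back at its owner."""

    def __init__(self, owner):
        self.owner = owner


class Owner:
    """Hypothetical stand-in for TagFile: owns a section, forming a cycle."""

    def __init__(self):
        self.section = Section(self)


def freed_after_del_and_after_collect():
    """Report whether the object is gone after del, and after gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from interfering with the demo
    try:
        obj = Owner()
        probe = weakref.ref(obj)  # lets us observe when obj is really freed
        del obj  # drops our name only; the cycle keeps the refcount above 0
        after_del = probe() is None
        gc.collect()  # the cycle collector finds and frees the cycle
        after_collect = probe() is None
    finally:
        if was_enabled:
            gc.enable()
    return after_del, after_collect
```

Under these assumptions the object survives the del and only disappears
after the explicit gc.collect(), which matches what the test case above
observes for the fd.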

I switched from apt_pkg to debian.deb822 because in my use case I want to read
from file A, modify the data, and write it back to A. For that I obviously
should not still have the original fd from reading A open when I write back
to A. One possible workaround would be to copy all of A into a StringIO and
then pass that to apt_pkg.TagFile. This would probably work, but I think the
expectation is that after doing:

	mypkgs = list(apt_pkg.TagFile("Packages"))

or:

	mypkgs = []
	for pkg in apt_pkg.TagFile("Packages"):
		mypkgs.append(pkg)

no files are left open. Currently, in both cases the fd is still around.
Instead, the file should be closed once the last pkgTagSection has been
retrieved.
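
To make the expectation concrete, here is a rough sketch of the
read, modify, write-back pattern in plain Python. It does not use apt_pkg or
debian.deb822; read_paragraphs, write_paragraphs and demo are hypothetical
helper names, and the "parsing" is just a split on blank lines. The only
point is that the reading fd is closed before the file is written again.

```python
import os
import tempfile


def read_paragraphs(path):
    """Return the blank-line separated paragraphs of a control-style file."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # The with block has ended, so the file is closed at this point.
    return [p.strip("\n") for p in text.split("\n\n") if p.strip()]


def write_paragraphs(path, paragraphs):
    """Write the paragraphs back, separated by one blank line each."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n\n".join(paragraphs) + "\n")


def demo():
    """Read a small file, change one field, write it back, read it again."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("Package: a\nVersion: 1\n\nPackage: b\nVersion: 2\n")
    try:
        pkgs = read_paragraphs(path)
        pkgs = [p.replace("Version: 1", "Version: 1.1") for p in pkgs]
        write_paragraphs(path, pkgs)  # safe: no reader fd is open any more
        return read_paragraphs(path)
    finally:
        os.remove(path)
```

This is the behaviour I would expect from iterating over a TagFile to the
end: once all sections are in memory, nothing holds the file open.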

Maybe I will find the time to implement this at some point, but unfortunately
the debian.deb822 module seems to work quite well (even though it's orders of
magnitude slower).

cheers, josch

