reassign 275852 apt
retitle 275852 pkgTagFile::Step fails on large Release files
thanks

I could reproduce the bug, so I decided to dig into it. As it turned out, the bug has nothing to do with hurd-i386, although removing those lines from the Release file "fixes" the problem.

Steps to reproduce:

========= START =========
greek0@orest:/tmp/code/xyz$ curl -O ftp://ftp.at.debian.org/debian/dists/sid/Release
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 34068  100 34068    0     0  36298      0 --:--:-- --:--:-- --:--:-- 81959
greek0@orest:/tmp/code/xyz$ python
Python 2.3.5 (#2, Mar 26 2005, 17:32:32)
[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import apt_pkg
>>> f = file("Release")
>>> p = apt_pkg.ParseTagFile(f)
>>> p.Step()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: Unable to parse package file  (1)
========= END =========

Debugging showed that the bug is in apt itself and is not directly related to python-apt. The error occurs somewhere in pkgTagFile::Step. Looking at the function's code and at the error message, we can say that the first Tag.Scan() failed, Fill() succeeded, and Tag.Scan() failed again. Several gdb sessions later it became clear that the while loop in pkgTagSection::Scan ran out of data to read without hitting the double newline.

The actual cause of the problem is that sid's Release file is larger than 32 kB (34068 bytes in the download above), so it doesn't fit into the buffer of pkgTagFile (whose size is determined by the default Size argument to the class's constructor in apt-pkg/tagfile.h). Removing the hurd-i386 lines seems to bring the Release file size back under 32 kB, which makes the problem disappear.

The second call to Fill(), which is supposed to handle this case, is useless: the failed call to Tag.Scan() does not alter Start/End in pkgTagFile, so the buffer is not altered either.
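
To illustrate the failure mode described above, here is a minimal toy model in Python. It is not apt's code: scan() and step() are hypothetical stand-ins for pkgTagSection::Scan and pkgTagFile::Step, and the sample data carries an explicit blank-line terminator, which is an assumption made for the illustration. It only shows why refilling a buffer that is already full cannot help.

```python
import io


def scan(buf: bytes) -> int:
    """Return the length of the first section in buf, including its
    blank-line terminator, or -1 if buf holds no terminator."""
    end = buf.find(b"\n\n")
    return end + 2 if end != -1 else -1


def step(stream, size: int) -> bool:
    """Toy model of the Scan / Fill / Scan sequence on a fixed-size buffer
    (hypothetical helper, not the real pkgTagFile::Step)."""
    buf = stream.read(size)  # initial fill: at most `size` bytes
    if scan(buf) != -1:
        return True
    # Second fill: nothing was consumed by the failed scan, so a full
    # buffer has no free space and its contents stay exactly the same.
    buf += stream.read(size - len(buf))
    return scan(buf) != -1


SIZE = 32 * 1024
small = b"Origin: Debian\n" + b"x" * 1000 + b"\n\n"
large = b"Origin: Debian\n" + b"x" * 34068 + b"\n\n"

ok_small = step(io.BytesIO(small), SIZE)           # section fits in 32 kB
ok_large = step(io.BytesIO(large), SIZE)           # terminator is past 32 kB
ok_large_64k = step(io.BytesIO(large), 64 * 1024)  # a larger buffer hides it
print(ok_small, ok_large, ok_large_64k)
```

With these inputs the 32 kB buffer handles the small section but not the large one, and the large one only parses once the buffer is big enough to hold it entirely.
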
Even if we somehow set Start/End to the right values and got new data into the buffer, calling Tag.Scan() on that changed buffer would be wrong, since pkgTagSection::Scan is not intended to be called more than once with changed buffers (data gained in the first run would be discarded). So if we want the Scan() call to be good for anything, we have to feed it a buffer containing the whole file, including the final double newline.

Solutions:

For now we can just change the default Size argument of pkgTagFile::pkgTagFile to 64 kB. However, we'll have the same problem if a Release file ever grows larger than that.

What we really want, IMHO, is a restructured pkgTagSection::Scan that supports some callback to get more data, or that can at least be called multiple times, accumulating data. Perhaps I can find some time this weekend to come up with a nice solution. If we really want to go that way, we'll have another ABI change, though.

Cheers,
Greek0
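
P.S. To make the "called multiple times, accumulating data" idea a bit more concrete, here is a rough sketch in Python. It is only an illustration of the approach, not a proposal for the actual C++ interface: AccumulatingScanner, feed() and read_section() are hypothetical names, and the blank-line terminator handling is simplified.

```python
import io


class AccumulatingScanner:
    """Hypothetical scanner that keeps data between calls instead of
    discarding it, so a section may arrive in several chunks."""

    def __init__(self):
        self._pending = bytearray()

    def feed(self, chunk: bytes):
        """Append chunk; return the first complete section (without its
        blank-line terminator), or None if more data is needed."""
        self._pending += chunk
        # Search the whole accumulated data, so a terminator that is
        # split across two chunks is still found.
        end = self._pending.find(b"\n\n")
        if end == -1:
            return None
        section = bytes(self._pending[:end])
        del self._pending[:end + 2]  # keep anything after the terminator
        return section


def read_section(stream, chunk_size: int = 32 * 1024):
    """Read chunks until the scanner reports a complete section.
    Returns None if the stream ends before a terminator is seen."""
    scanner = AccumulatingScanner()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return None
        section = scanner.feed(chunk)
        if section is not None:
            return section


body = b"Origin: Debian\n" + b"x" * 34068
section = read_section(io.BytesIO(body + b"\n\nrest"), 32 * 1024)
print(section == body)
```

Here the chunk size only bounds how much is read at a time, not how large a section may be, which is the property the fixed 32 kB (or 64 kB) buffer lacks.
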