On Wed, Jun 04, 2014 at 12:58:21AM +1000, Stuart Prescott wrote:
> python-apt maintainers: do you think it's reasonable to change apt_pkg.TagFile
> (presumably by changing libapt-pkg) to split paragraphs not only on blank
> lines but also on whitespace-only lines? For reference, policy §5.1 permits
> such control files with pretty rubbery language:
>
>   The paragraphs are separated by empty lines. Parsers may accept lines
>   consisting solely of spaces and tabs as paragraph separators, but control
>   files should use empty lines.

(not python-, but apt "proper" ;) )

My reading is actually quite different: First sentence. Period.
Parsers may be more relaxed, but do not expect it: control files should
use empty lines. (aka: "should" not in an "as Mylord pleases" sense, but
in an "if you don't have a damn good reason to do otherwise, follow my
lead" sense, as in "Non-conformance with … should … will generally be
considered a bug" in §1.1)

> I tend to err on the side of the parser being lax and the generator being
> strict, which makes me think that both deb822.iter_paragraphs and
> apt_pkg.TagFile should split on these whitespace-only lines.

Being lax usually costs performance. The TagFile parser in libapt deals
by default only with machine-generated files, so it tends to be rather
strict in order not to waste time accounting for things which never
happen in practice. Looking at 'apt-cache stats' on my machine (with
arguably many sources) shows that more than half a million sections are
being parsed, so "just ignoring some spaces" could have very visible
effects for me.

There is an exception to this: the preferences file(s), which have a
deb822 format as well. The pkgTagSection is therefore subclassed here to
allow commented lines (aka: lines starting with #) before and after each
section (in between just works "by accident" now that I see the code –
again, as I wrote that ~4 years ago as one of my first patches…).
I tell you this because comments are allowed in a control file by
policy, so using pristine pkgTagFile here will have interesting effects
(like multiline fields split by comments). A non-empty should-be-empty
line might be the smallest of your problems…

We will have to work on that for the preferences file anyway, and
reading control files ourselves isn't unheard of either, but that isn't
going to be provided by stock pkgTagFile, I assume. I could imagine a
pkgTagSection which can be told how relaxed it should be (or which at
least allows more plugin code than at the moment), but that code doesn't
exist yet and I have no idea what the Python layer looks like on top of
that…

Best regards

David Kalnischkies
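
To make the strict-versus-lax distinction above concrete, here is a minimal pure-Python sketch. It is not the libapt or python-apt code: the function name split_paragraphs and its lax and skip_comments parameters are hypothetical, invented only for illustration. It shows how a whitespace-only line and a comment line change the result depending on how relaxed the parser is told to be.

```python
def split_paragraphs(text, lax=False, skip_comments=False):
    """Split deb822-style text into paragraphs (lists of raw lines).

    Hypothetical helper for illustration only; not an apt_pkg API.
    lax=False: only truly empty lines separate paragraphs.
    lax=True: lines made up solely of spaces and tabs separate them too.
    skip_comments=True: lines starting with '#' are dropped entirely.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        if skip_comments and line.startswith("#"):
            # Dropping the comment keeps a multiline field in one piece.
            continue
        if lax:
            is_separator = line.strip(" \t") == ""
        else:
            is_separator = line == ""
        if is_separator:
            # Close the current paragraph; repeated separators are ignored.
            if current:
                paragraphs.append(current)
                current = []
        else:
            current.append(line)
    if current:
        paragraphs.append(current)
    return paragraphs


# The separator between foo and bar contains a space and a tab.
sample = (
    "Package: foo\nVersion: 1.0\n \t\n"
    "Package: bar\nVersion: 2.0\n\n"
    "Package: baz\n"
)
strict_result = split_paragraphs(sample)
lax_result = split_paragraphs(sample, lax=True)

# A comment in the middle of a multiline field.
commented = "Description: short\n# a note\n long text\n"
with_comments = split_paragraphs(commented)
without_comments = split_paragraphs(commented, skip_comments=True)
```

In strict mode the whitespace-only line stays inside the first paragraph, so foo and bar end up in the same section; in lax mode they are split. The per-line strip in lax mode is the kind of extra work that adds up over half a million sections.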