
Bug#538723: python-apt: Please allow a way to get "raw" data for a tag



Package: python-apt
Version: 0.7.11.0
Severity: wishlist
Tags: patch

There are times when it would be nice to get the raw data for a tag,
without any whitespace stripped off the front.  For example, see
#538376, where we would like to know if the data started with a newline.

I have attached a patch that adds a FindRaw method to TagSection.  It
uses the C++ Find(char *Name, unsigned &Pos) method to get the position
of the tag in the section, then the Get method to get the start and
stop of the tag.  The result is the whole tag, including its name and
the colon, not just the value after the colon with whitespace removed.
For example:

    >>> f = open('/tmp/apt_pkg.tmp', 'w')
    >>> f.write("""\
    ... Package: foo
    ... Bar:
    ...  Baz
    ... """)
    >>> f.close()
    >>> import apt_pkg; apt_pkg.init()
    >>> parser = apt_pkg.ParseTagFile(open('/tmp/apt_pkg.tmp'))
    >>> parser.Step()
    1
    >>> parser.Section.FindRaw('Package')
    'Package: foo\n'
    >>> parser.Section.FindRaw('Bar')
    'Bar:\n Baz'
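
For anyone who wants to experiment without the patched module, here is a
rough pure-Python approximation of what FindRaw returns.  It is only a
sketch, not the patch itself: the name find_raw is made up for this
example, matching is case-sensitive here, and the trailing newline on
the last field of a section may differ from what the C++ code returns.

```python
def find_raw(section_text, name):
    """Return the raw text of field `name` (hypothetical helper), or None.

    The result includes the field name, the colon, and any continuation
    lines, with no whitespace stripped.
    """
    lines = section_text.splitlines(True)  # keep the line endings
    prefix = name + ':'
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            end = i + 1
            # Continuation lines begin with a space or a tab.
            while end < len(lines) and lines[end][:1] in (' ', '\t'):
                end += 1
            return ''.join(lines[i:end])
    return None


section = "Package: foo\nBar:\n Baz\n"
print(repr(find_raw(section, 'Package')))
print(repr(find_raw(section, 'Bar')))
```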

This way, I can write a wrapper around TagSection that handles leading
whitespace differently than the default implementation, e.g.:

    import UserDict

    class TagSectionWrapper(object, UserDict.DictMixin):
        """Wrap a TagSection object, using the FindRaw method instead of Find
    
        This allows us to pick which whitespace to strip off the beginning and end
        of the data, so we don't lose leading newlines.
        """
    
        def __init__(self, parser):
            self.parser = parser
    
        def keys(self):
            return self.parser.keys()
    
        def __getitem__(self, key):
            s = self.parser.FindRaw(key)
    
            if s is None:
                raise KeyError(key)
    
            data = s.partition(':')[2]
    
            # Get rid of spaces and tabs after the :, but not newlines, and strip
            # off any newline at the end of the data.
            return data.lstrip(' \t').rstrip('\n')
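
The whitespace handling in __getitem__ can be tried on its own, without
python-apt.  The sketch below applies the same partition/lstrip/rstrip
steps to the two raw strings from the FindRaw example; the function name
value_from_raw is hypothetical and exists only for this illustration.

```python
def value_from_raw(raw):
    """Extract a field value from raw tag text (hypothetical helper).

    Spaces and tabs directly after the colon are dropped, a leading
    newline is kept, and trailing newlines are removed.
    """
    data = raw.partition(':')[2]
    return data.lstrip(' \t').rstrip('\n')


print(repr(value_from_raw('Package: foo\n')))  # 'foo'
print(repr(value_from_raw('Bar:\n Baz')))      # '\n Baz'
```

The second result keeps its leading newline, which is the information
that the default Find-based lookup loses.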

Thanks for considering the patch.

-- 
John Wright <jsw@debian.org>


