
Re: off_t and dpkg_ar_member_get_size



On Fri, 2012-06-15 at 14:30:20 +0200, Niels Thykier wrote:
> I was considering documenting the functions of ar.c when I noticed a
> possible issue in dpkg.  It seems to me that dpkg_ar_member_get_size may
> (silently) truncate a "large file" when off_t is 32 bits.  But I admit
> being uncertain here as I had difficulties finding a normative
> specification of the ar format.

Well, the “normative specifications” are the ones used and implemented
in traditional Unices. Besides the Solaris man page you linked, other
references are the FreeBSD man page [0], the Mac OS X man page [1],
the HP-UX man page [2], and the OpenServer man page [3].

[0] <http://www.freebsd.org/cgi/man.cgi?query=ar&sektion=5>
[1] <https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man5/ar.5.html>
[2] <http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02255322/c02255322.pdf>
[3] <http://osr507doc.sco.com/en/man/html.FP/ar.FP.html>

> As far as I can tell, off_t is signed[1] and can be 32 bits[2].
> According to the wikipedia[3] and <ar.h>, the size of an ar member is
> specified as (up to) 10 (ASCII encoded) digits.
>   The question is whether or not all 10 chars can be digits or one of
> them will always be a space (or other terminator).  I admit having
> difficulties finding a normative answer.  I did find [4], which suggests
> the file size limit for a given member is 4 GB (implying that all 10
> chars may be digits).

Yes, all 10 characters of the size field may be digits.

>   In the case that all 10 digits are usable, then a 32 bit off_t can
> overflow when parsing the file size.

> The issue is that dpkg_ar_member_get_size does not check for this case
> and will cause the "size" to overflow.  In fact, it may even "flow back
> into positive", making it impossible for the caller to discover that
> an overflow occurred.

Indeed. But the more general problem is that using off_t there is not
really appropriate: if those functions end up being public, their
signature will change depending on how the caller has been configured
(the width of off_t is a build-time choice), and the diverging
signatures might cause run-time errors.
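
To make the width dependency concrete, here is a minimal sketch (the
helper name is hypothetical, this is not code from dpkg). It assumes
off_t is a signed integer type without padding bits. The largest value
a 10-digit size field can carry is 9999999999, which needs 34 value
bits, so a 32-bit off_t cannot represent it while a 64-bit one can:

```c
#include <limits.h>
#include <stdbool.h>
#include <sys/types.h>

/* 2^33 < 9999999999 < 2^34, so 34 value bits are required. */
#define AR_SIZE_VALUE_BITS 34

/* Hypothetical helper: reports whether off_t, as configured for this
 * translation unit, can hold every possible ar member size.
 * Assumes off_t is signed and has no padding bits. */
static bool
off_t_holds_any_ar_size(void)
{
	/* One bit is taken by the sign. */
	unsigned int value_bits = sizeof(off_t) * CHAR_BIT - 1;

	return value_bits >= AR_SIZE_VALUE_BITS;
}
```

On glibc-based 32-bit systems the answer typically depends on whether
the translation unit was built with -D_FILE_OFFSET_BITS=64, which is
exactly why a library and its callers can end up disagreeing.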

AFAIR I have some code stashed somewhere that switches away from off_t
to a type that's guaranteed to be 64-bit regardless of the environment,
which would fix that specific issue. I'll try to dig that code out and
push it at some point.
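
As a rough illustration of that direction (a sketch only, with a
hypothetical function name and an assumed int64_t result type, not the
stashed code itself), parsing the 10-character size field into a fixed
64-bit type cannot overflow, because 9999999999 is far below INT64_MAX.
The sketch also assumes the field is decimal digits followed by space
padding, as the man pages above describe:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AR_SIZE_FIELD_LEN 10

/* Hypothetical helper: parses the ar member size field (decimal
 * digits, right-padded with spaces) into a fixed-width 64-bit value.
 * Returns false on an empty or malformed field. */
static bool
ar_size_parse(const char field[AR_SIZE_FIELD_LEN], int64_t *size)
{
	int64_t value = 0;
	size_t i = 0;

	assert(field != NULL && size != NULL);

	/* Accumulate the leading digits; at most 10 of them, so the
	 * result is at most 9999999999 and always fits in int64_t. */
	for (; i < AR_SIZE_FIELD_LEN && field[i] >= '0' && field[i] <= '9'; i++)
		value = value * 10 + (field[i] - '0');

	/* At least one digit is required. */
	if (i == 0)
		return false;

	/* Whatever follows the digits must be space padding. */
	for (; i < AR_SIZE_FIELD_LEN; i++)
		if (field[i] != ' ')
			return false;

	*size = value;
	return true;
}
```

A caller that still needs an off_t or a size_t afterwards can then do
one explicit range check at the conversion point, instead of relying on
the parser's return type being wide enough.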

thanks,
guillem

