Bug#1024811: Re: Bug#1024811: linux: /proc/[pid]/stat unparsable
Donald Buczek dixit:
>No, Escaping would break existing programs which parse the line by
>searching for the ')' from the right.
Huh? No!
The format is "(" + string + ") " after all, and only the string
part would get escaped.
The only visible change would be that the names of programs containing
a whitespace character (and, ideally, a ‘(’) would be escaped, and
those are the ones that are currently broken anyway.
And perhaps backslashes, if you decide to encode unambiguously,
but given the field length limit, I don’t think that was ever
a goal (both because I suspect this file was intended to give
a quick overview and therefore deliberately shortens the name,
and because the full info is available elsewhere), so there is
no need to encode unambiguously.
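
To illustrate the idea, here is a minimal Python sketch of such an
escaping. It is not kernel code and not an agreed-upon format: the
function name escape_comm, the octal \ooo notation and the sample
process name are all assumptions made for this example. Backslashes
are deliberately left alone, so the encoding is not unambiguous.

```python
# Bytes that would confuse a whitespace-splitting parser.
# The exact set is an assumption for this sketch.
SPECIAL = b" \t\n\r\v\f()"


def escape_comm(name: bytes) -> bytes:
    """Replace whitespace and parentheses with \\ooo octal escapes.

    Hypothetical helper; backslashes are passed through unchanged,
    so the result cannot always be decoded unambiguously.
    """
    out = bytearray()
    for b in name:
        if b in SPECIAL:
            out += ("\\%03o" % b).encode("ascii")
        else:
            out.append(b)
    return bytes(out)


# A made-up stat line: "(" + escaped string + ") " as before.
line = b"1234 (" + escape_comm(b"evil) S (x y") + b") S 1 1234"
fields = line.split()
```

With the name escaped, splitting by whitespace puts the state in the
third field again, and a parser that searches for the ')' from the
right still finds the same closing parenthesis.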
>If some documentation suggests, that you can just parse it with scanf,
>the documentation should be corrected/improved instead.
No. Someone recently did a survey, and most code in existence splits
by whitespace. Fix the kernel bug instead.
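
A small Python sketch of what goes wrong today, for reference. The
stat line is made up for the example (a process named "my prog"), and
parse_stat is a hypothetical helper showing the search-from-the-right
approach Donald describes, not code from any existing program.

```python
# Made-up line in the current format; the process name contains a space.
line = "4242 (my prog) S 1 4242 4242 0"

# What most existing code does: split by whitespace.
naive = line.split()
# naive[1] is "(my" and naive[2] is "prog)", so the state field "S"
# is no longer where the parser expects it.


def parse_stat(text: str):
    """Parse by locating the first '(' and the last ')'."""
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    rest = text[rparen + 1:].split()
    return pid, comm, rest


pid, comm, rest = parse_stat(line)
```

The second variant copes with the name, but only code written with
this pitfall in mind does it that way.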
>Are you referring to proc(5) "The fields, in order, with their proper
>scanf(3) format specifiers, are listed below" [1] or something else?
Yes.
>The referenced manual page is wrong in regard to the length, too. There
>is no 16 character limit to the field, because it can contain a
>workqueue task name, too:
Probably used to be cut off after 16. Go fix that in the manpage
then. But do fix the encoding kernel-side.
>In fact, if you start escaping now you might also break programs which
>rely on the current 64 character limit.
Just cut off at the end then, like I suspect was done at 16 bytes
initially.
Or strip whitespace and closing parenthesis if present instead
of encoding them, or replace them with a question mark.
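
The replace-with-a-question-mark variant could look roughly like this
in Python. Again only a sketch: sanitize_comm is a hypothetical name,
the default limit of 64 is simply the figure quoted above, and the set
of replaced bytes is an assumption.

```python
# Whitespace and the closing parenthesis; assumed set for this sketch.
UNSAFE = b" \t\n\r\v\f)"


def sanitize_comm(name: bytes, limit: int = 64) -> bytes:
    """Replace unsafe bytes with '?' and cut off at `limit` bytes."""
    cleaned = bytes(ord("?") if b in UNSAFE else b for b in name)
    return cleaned[:limit]


# Made-up example name.
example = sanitize_comm(b"my prog) S")
```

Because each byte is replaced one for one, the length never grows and
the cut-off only affects names that were already too long.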
bye,
//mirabilos
--
“It is inappropriate to require that a time represented as
seconds since the Epoch precisely represent the number of
seconds between the referenced time and the Epoch.”
-- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2