
Bug#848683: Process accounting: Using the NETLINK interface, the command TASKSTATS_CMD_GET returns -EINVAL



Source: linux
Severity: normal

Dear Ben, dear maintainers,

starting from a Debian bug report of mine, Gerlof Langeveld, developer
of system and process monitor atop¹, found two issues with process
accounting². I also started an LKML thread about these³.

[1] http://atoptool.nl/
[2] #833997 atop: process accounting does not work:
https://bugs.debian.org/833997
[3] [REGRESSION] Two issues that prevent process accounting (taskstats) from
working correctly: https://lkml.org/lkml/2016/12/19/182

I am reporting the two issues separately.

I am currently not running a Debian kernel, but Gerlof verified the issue with
a Debian kernel. Process accounting works with 3.16, but fails with at least
4.7 and 4.8. I think it also fails with earlier versions, but I do not
know exactly since which version.

I reported the first issue as:

Bug#848682: process accounting sometimes does not work
https://bugs.debian.org/848682


2) When using the NETLINK interface, the command TASKSTATS_CMD_GET
consistently returns -EINVAL.

Bug 190711 - Process accounting: Using the NETLINK inface, the command TASKSTATS_CMD_GET returns -EINVAL
https://bugzilla.kernel.org/show_bug.cgi?id=190711

The code used by the atopacctd daemon is based on the demo code
'getdelays.c' that can be found in the kernel source tree
(..../linux/Documentation/accounting/getdelays.c). This 'getdelays'
program no longer works with the newer kernels either (it also gets
-EINVAL on the same call). I spent a lot of time trying to get the
code running (there are many places in the kernel code where -EINVAL
can be returned for this call), but I did not succeed. It really is
an incompatibility introduced by the kernel code.
It would be nice if the kernel maintainers provided a working version of
the getdelays program in the kernel source tree.
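
For readers who have not seen the request in question, here is a minimal
sketch of how a TASKSTATS_CMD_GET message for a single PID is laid out on
the wire. It is not the atopacctd or getdelays code. The function name
build_taskstats_get is made up for illustration, the constants are copied
from <linux/netlink.h> and <linux/taskstats.h> as far as I know them, and
the generic netlink family id is assumed to have been resolved at runtime
beforehand (it is not a fixed number).

```python
import struct

NLM_F_REQUEST = 0x1          # from <linux/netlink.h>
TASKSTATS_CMD_GET = 1        # from <linux/taskstats.h>
TASKSTATS_CMD_ATTR_PID = 1   # from <linux/taskstats.h>
TASKSTATS_GENL_VERSION = 1   # from <linux/taskstats.h>


def build_taskstats_get(family_id, pid, seq=1, port_id=0):
    """Hypothetical helper: build one TASKSTATS_CMD_GET request as bytes.

    family_id is the generic netlink family id of "TASKSTATS", which the
    caller has to look up at runtime; it is an input here, not a constant.
    """
    # struct nlattr: u16 length (header + payload), u16 type, then a u32 PID.
    attr = struct.pack("=HHI", 8, TASKSTATS_CMD_ATTR_PID, pid)
    # struct genlmsghdr: u8 cmd, u8 version, u16 reserved.
    genl = struct.pack("=BBH", TASKSTATS_CMD_GET, TASKSTATS_GENL_VERSION, 0)
    payload = genl + attr
    # struct nlmsghdr: u32 total length, u16 type, u16 flags, u32 seq, u32 pid.
    header = struct.pack("=IHHII", 16 + len(payload), family_id,
                         NLM_F_REQUEST, seq, port_id)
    return header + payload
```

A message built this way would be sent over an AF_NETLINK socket of
protocol NETLINK_GENERIC. Any mistake in the lengths or attribute types is
one of the many ways to get -EINVAL back, which is part of what makes the
failure hard to track down.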

I only experience this problem on Debian 8 with a 4.8 kernel (virtual
machine with 4 cores).
On CentOS 7 with a 4.8 kernel it works fine (physical machine with 4 cores).

In any case, I will adapt atopacctd so that it detects and logs
the -EINVAL and then terminates.
The current version of atopacctd keeps running, which is not useful at all.
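
To illustrate the "detect, log and terminate" behaviour described above,
here is a rough sketch of how an NLMSG_ERROR reply can be recognised and
acted upon. This is not the actual atopacctd change. The helper names
netlink_errno and check_taskstats_reply are hypothetical, and the sketch
assumes the reply buffer holds a single netlink message in host byte order.

```python
import errno
import struct
import sys

NLMSG_ERROR = 0x2                     # message type from <linux/netlink.h>
NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
NLMSGERR = struct.Struct("=i")        # signed error code, zero or negative


def netlink_errno(reply):
    """Return the positive errno carried by an NLMSG_ERROR reply, else 0."""
    if len(reply) < NLMSGHDR.size:
        raise ValueError("truncated netlink header")
    _length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(reply, 0)
    if msg_type != NLMSG_ERROR:
        return 0                      # a regular reply, not an error report
    if len(reply) < NLMSGHDR.size + NLMSGERR.size:
        raise ValueError("truncated nlmsgerr payload")
    (code,) = NLMSGERR.unpack_from(reply, NLMSGHDR.size)
    return -code                      # the kernel reports errors as negative


def check_taskstats_reply(reply, log=sys.stderr):
    """Hypothetical helper: log and terminate if the request was rejected."""
    if netlink_errno(reply) == errno.EINVAL:
        print("TASKSTATS_CMD_GET rejected with -EINVAL, giving up", file=log)
        raise SystemExit(1)
```

Exiting with a non-zero status, instead of continuing to run without any
data, would also let an init system notice that the daemon has failed.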


Marc Haber, maintainer of the atop package, Gerlof Langeveld, developer of
atop, and I are currently discussing workarounds in atop and/or letting the
systemd service fail, for the time until upstream kernels with these issues
fixed are shipped by distributions. Still, it would be nice to remove those
workarounds and have the kernel work correctly again at some point in the
future.

Thanks,
Martin


-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (200, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.8.14-tp520-btrfstrim+ (SMP w/4 CPU cores; PREEMPT)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)
