
Bug#804857: linux: New feature: enable CONFIG_NO_HZ_FULL and CONFIG_RCU_NOCB_CPU/CONFIG_RCU_NOCB_CPU_NONE



On Sat, 13 Nov 2021 00:27:12 +0100,
Frederic Weisbecker <frederic@kernel.org> wrote:

> On Thu, Nov 04, 2021 at 10:05:02PM +0100, Henning Schild wrote:
> > On Sat, 30 Oct 2021 16:04:35 +0200,
> > Salvatore Bonaccorso <carnil@debian.org> wrote:
> >   
> > > Control: tags -1 + moreinfo
> > > 
> > > On Wed, Oct 27, 2021 at 10:16:56AM +0200, Georg Müller wrote:  
> > > > > But for other configurations it is worse:
> > > > > 
> > > > > config NO_HZ_FULL
> > > > >         bool "Full dynticks system (tickless)"
> > > > > ...
> > > > >          This is implemented at the expense of some overhead
> > > > > in user <-> kernel transitions: syscalls, exceptions and
> > > > > interrupts. Even when it's dynamically off.
> > > > > 
> > > > >          Say N.
> > > > >     
> > > > 
> > > > 
> > > > Upstream commit 176b8906 changed the description regarding
> > > > NO_HZ_FULL:
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=176b8906c399a170886ea4bad5b24763c6713d61
> > > > 
> > > >     
> > > > > By default, without passing the nohz_full parameter, this
> > > > > behaves just like NO_HZ_IDLE.
> > > > >
> > > > > If you're a distro say Y.    
> > > 
> > > While this is changed, and distros encouraged to select it,
> > > selecting this would enable both CONFIG_VIRT_CPU_ACCOUNTING_GEN
> > > and CONFIG_RCU_NOCB_CPU.
> > > 
> > > For CONFIG_VIRT_CPU_ACCOUNTING_GEN
> > > 
> > >           Select this option to enable task and CPU time
> > > accounting on full dynticks systems. This accounting is
> > > implemented by watching every kernel-user boundaries using the
> > > context tracking subsystem. The accounting is thus performed at
> > > the expense of some significant overhead.
> > > 
> > >           For now this is only useful if you are working on the
> > > full dynticks subsystem development.
> > > 
> > >           If unsure, say N.
> > > 
> > > which indicates some significant overhead.  
> > 
> > I cannot answer that off the top of my head. I would have to dig as
> > well. I might get back in about two weeks if nobody else finds an
> > answer.
> > 
> > But I took the liberty of adding Frederic to Cc, the author of
> > the "distro reassure" patch.
> > 
> > I am not sure such a change would be allowed for bullseye (5.10), or
> > whether the answer for 5.10 would differ from that for, e.g., 5.15.
> > 
> > Reading what it is, maybe it can in fact be decoupled from
> > NO_HZ_FULL. Which would mean an upstream patch and backporting (if
> > preempt would do that, but could be considered a "performance bug"
> > I guess) 
> > > And for CONFIG_RCU_NOCB_CPU
> > > 
> > >           Use this option to reduce OS jitter for aggressive HPC
> > > or real-time workloads.  It can also be used to offload RCU
> > >           callback invocation to energy-efficient CPUs in
> > > battery-powered asymmetric multiprocessors.  The price of this
> > > reduced jitter is that the overhead of call_rcu() increases and
> > > that some workloads will incur significant increases in
> > > context-switch rates.
> > > 
> > >           This option offloads callback invocation from the set of
> > > CPUs specified at boot time by the rcu_nocbs parameter.  For each
> > >           such CPU, a kthread ("rcuox/N") will be created to
> > > invoke callbacks, where the "N" is the CPU being offloaded, and
> > > where the "x" is "p" for RCU-preempt (PREEMPTION kernels) and "s"
> > > for RCU-sched (!PREEMPTION kernels).  Nothing prevents this
> > > kthread from running on the specified CPUs, but (1) the kthreads
> > > may be preempted between each callback, and (2) affinity or
> > > cgroups can be used to force the kthreads to run on whatever set
> > > of CPUs is desired.
> > > 
> > >           Say Y here if you need reduced OS jitter, despite added
> > > overhead. Say N here if you are unsure.
> > > 
> > > This adds overhead as well.
> > > 
> > > Is this still to be considered true?  
> > 
> > Probably, but only for people who actively choose to use it, and only
> > for the CPUs they choose via the "rcu_nocbs" cmdline param. If it is
> > not set, everything will be as it was.
> > I already indicated that in the commit message of my MR:
> > https://salsa.debian.org/kernel-team/linux/-/merge_requests/385  
> 
> Ok so the past traditional combo for a distro is:
> 
>    CONFIG_NO_HZ_IDLE=y
>    # CONFIG_RCU_NOCB_CPU is not set
>    CONFIG_TICK_CPU_ACCOUNTING=y
> 
> Then nohz_full support was introduced, which allows userspace
> tasks to run without being annoyed by tick interrupts. But this feature
> is for extreme workloads. So we arranged for this support not to add
> additional overhead when it is not used.
> 
> This means that
> 
>    CONFIG_NO_HZ_FULL=y
>    
> which also automatically selects:
> 
>    CONFIG_RCU_NOCB_CPU=y
>    CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> 
> are not expected to bring more overhead than their traditional
> counterparts, unless kernel boot parameters such as "nohz_full="
> or "rcu_nocbs=" are passed.
> 
> So you can safely enable CONFIG_NO_HZ_FULL=y. I guess the only corner
> case is when you optimize your kernel for size and you are sure you
> won't have any user of nohz_full for your kernel, but I suspect some
> Debian users, like me for example, might be interested in that
> feature.
> 
> I should clarify the help text for CONFIG_VIRT_CPU_ACCOUNTING_GEN,
> which is definitely out of date.
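
A minimal sketch of how one could check the point above on a running
system: whether NO_HZ_FULL is built in, and whether "nohz_full=" or
"rcu_nocbs=" were actually passed at boot. The helper names
(parse_kconfig, active_params) are made up for illustration, and
reading /boot/config-<release> assumes the config file is installed
next to the kernel image, as Debian does.

```python
import platform
import re
from pathlib import Path


def parse_kconfig(text):
    """Map CONFIG_* names to their values; '# CONFIG_X is not set' becomes 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def active_params(cmdline, names=("nohz_full", "rcu_nocbs")):
    """Return the boot parameters from `names` that appear on the command line."""
    found = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in names:
            found[key] = value
    return found


if __name__ == "__main__":
    # Assumed locations; skip quietly if they do not exist on this system.
    config = Path(f"/boot/config-{platform.release()}")
    cmdline = Path("/proc/cmdline")
    if config.is_file() and cmdline.is_file():
        opts = parse_kconfig(config.read_text())
        for name in ("CONFIG_NO_HZ_FULL", "CONFIG_RCU_NOCB_CPU",
                     "CONFIG_VIRT_CPU_ACCOUNTING_GEN"):
            print(name, "=", opts.get(name, "absent"))
        print("boot parameters in use:", active_params(cmdline.read_text()))
```

If the last line prints an empty dict, neither parameter was passed,
which is the case described above as behaving like the traditional
combo.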

Thanks for the clarification. Maybe the patches dealing with the docs
should also be backported to stable kernels where applicable, so that
other distros can find the "if you are a distro" hint in, e.g., a
recent 5.10 or maybe even 4.x.

regards,
Henning

> Thanks.

