Re: nbd, nbdkit, loopback mounts and memory management

We use a modified nbdkit ("cbdkit") in production with kernel nbd to run a loopback device for our CloudBD Disks (www.cloudbd.io). We have observed kernel deadlocks under memory pressure when using nbdkit, and have had to take steps such as preallocating and mlocking all nbdkit and plugin memory. Although our plugin is a loopback over a Unix domain socket, we still need to send data back out over the network to object storage, which creates additional memory pressure on the system. By preallocating and locking the memory in the nbd loopback device write path, and by reserving enough kernel memory for the outgoing network writes, we are able to avoid the kernel deadlock even under extremely heavy load on low-memory systems.
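
To illustrate the preallocate-and-lock idea, here is a minimal sketch in C. It is not the actual cbdkit code: the buf_pool structure and function names are hypothetical, and only mmap(), mlock(), mlockall() and munmap() are real calls. The assumption is that every buffer used in the write path is allocated, faulted in and pinned at startup, so nothing has to be allocated while the kernel is trying to write back dirty pages.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical fixed-size buffer pool, set up once at startup. */
struct buf_pool {
    unsigned char *base;  /* start of the preallocated region */
    size_t buf_size;      /* size of each buffer in bytes */
    size_t count;         /* number of buffers in the pool */
    int locked;           /* 1 if mlock() succeeded, 0 otherwise */
};

/* Allocate count buffers of buf_size bytes, fault every page in and
 * try to pin the region in RAM. Returns 0 on success, or -1 if the
 * request is empty, overflows, or the mapping fails. A failed mlock()
 * (for example because of RLIMIT_MEMLOCK) is recorded in p->locked
 * rather than treated as fatal. */
static int buf_pool_init(struct buf_pool *p, size_t buf_size, size_t count)
{
    p->base = NULL;
    p->buf_size = 0;
    p->count = 0;
    p->locked = 0;

    if (buf_size == 0 || count == 0 || buf_size > SIZE_MAX / count)
        return -1;

    size_t total = buf_size * count;
    void *m = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    /* Touch every page now so no page fault is taken in the write path. */
    memset(m, 0, total);

    p->base = m;
    p->buf_size = buf_size;
    p->count = count;
    p->locked = (mlock(m, total) == 0);
    return 0;
}

/* Return buffer number i, or NULL if i is out of range. */
static unsigned char *buf_pool_get(const struct buf_pool *p, size_t i)
{
    assert(p != NULL);
    if (i >= p->count)
        return NULL;
    return p->base + i * p->buf_size;
}

/* Release the pool; unmapping also drops the lock on the region. */
static void buf_pool_destroy(struct buf_pool *p)
{
    if (p->base != NULL)
        munmap(p->base, p->buf_size * p->count);
    p->base = NULL;
    p->count = 0;
    p->locked = 0;
}

/* Pin everything the process has mapped now and maps later.
 * Returns 0 on success, -1 on failure (errno is set by mlockall). */
static int lock_process_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}
```

A server following this approach would call something like buf_pool_init() and lock_process_memory() before attaching the nbd device, and would refuse to start (or at least warn loudly) if the locking step fails, since an unlocked buffer can still be paged out under pressure.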

I am able to reproduce a kernel deadlock by putting a low-memory system under heavy load with our logging set to its most verbose (DEBUG) level. This causes rsyslog to try to allocate memory while we are in the loopback device write path, which is itself trying to flush the buffer cache, and this can wedge the system.

- Shaun McDowell

On Fri, Feb 15, 2019 at 6:01 PM Pavel Machek <pavel@ucw.cz> wrote:

On Fri 2019-02-15 22:41:26, Richard W.M. Jones wrote:
> On Fri, Feb 15, 2019 at 08:19:54PM +0100, Pavel Machek wrote:
> > Hi!
> >
> > I watched the FOSDEM talk about
> > nbdkit... https://www.youtube.com/watch?v=9E5A608xJG0 . Nice. But a word
> > of warning: I'm not sure using it read-write on localhost is safe.
> >
> > In particular, a user application could create a lot of dirty data
> > quickly. If there's not enough memory for nbdkit (or nbd-client or
> > nbd-server), you might get a deadlock.
>
> Thanks for the kind words about the talk. I've added Wouter Verhelst
> & the NBD mailing list to CC. Although I did the talk because the
> subject is interesting, how I actually use nbdkit / NBD is to talk to
> qemu and that's where I have most experience and where we (Red Hat)
> use it in production systems.
>
> However in January I spent a lot of time exercising the NBD loop-mount
> + nbdkit case using fio in order to find contention / bottlenecks in
> our use of threads and locks. I didn't notice any particular problems
> then, but it's possible my testing wasn't thorough enough. Or that
> fio only creates small numbers of dirty pages (because of locality in
> its access patterns I guess?)
>
> When you say it's not safe, what could happen? What would we observe
> if it was going wrong?

I'm not saying I've seen it happen, or have a test. But my
understanding of memory management says it could deadlock... if nbd
tried allocating memory while memory was "full" of dirty data.

Dunno, something like ... take 1GB block device with 1GB RAM
machine. Create memory pressure so that nbdkit (etc) is dropped from
memory. Then quickly make all the data on the block device dirty.

I believe that scenario is something that cannot happen on a system
without NBD in a loopback configuration.

The situation may be made worse if nbdkit needs to allocate memory for
compression buffers or something like that.

Best regards,

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html