It seems to me that passing "crashkernel=1024M nmi_watchdog=1 crashkernel=384M-:128M" as kernel boot parameters makes the system much more stable. This is based on a fairly short test period, though: roughly 6 times the expected time to failure.

Best,
Mateusz