
Do no harm: Data loss from new (bookworm2trixie) discard=async default



Hi,

Sorry for not preempting the backlog sooner.

A very brief history: many SSDs, including various Samsung models and,
if I remember correctly, many drives with old SandForce controllers,
have broken discard=async support.  This was a big issue back in
2011-2014, and in some (many?) cases it was a data loss risk.

Linux 6.2 started enabling discard=async by default (at least for
btrfs), and it seems to follow that this must harm many users of SSDs
from at least 2011-2014 and earlier.  Does Linux 6.12.x, for trixie,
have sufficient quirk coverage to make the new default safe, and does it
fall back to discard=sync for affected hardware?  Alternatively, has our
kernel been patched to maintain bookworm's 6.1.x behaviour of
discard=sync?
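
For anyone who wants to see what their own system ended up with, here is
a minimal sketch that reads mount options in /proc/mounts format and
reports the discard mode of each btrfs mount.  It assumes that btrfs
shows synchronous discard as a bare "discard" option and asynchronous
discard as "discard=async"; the function name is my own, and this is
not an official tool.

```python
import os


def btrfs_discard_modes(mounts_text):
    """Map each btrfs mount point to 'async', 'sync', or 'none'.

    Assumption: sync discard appears as a bare "discard" option and
    async discard as "discard=async" in the mount options field.
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, ...
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        options = fields[3].split(",")
        if "discard=async" in options:
            mode = "async"
        elif "discard" in options or "discard=sync" in options:
            mode = "sync"
        else:
            mode = "none"
        modes[fields[1]] = mode
    return modes


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        for mount_point, mode in btrfs_discard_modes(f.read()).items():
            print(f"{mount_point}: discard mode {mode}")
```

This only reports what is currently mounted; it says nothing about
whether the drive underneath handles discards correctly.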

Security-conscious users maintain that it presents a security risk when
a filesystem issues discards to the underlying LUKS layer.  Are we going
to start doing this by default for trixie, or are we still going to
block it at the dm-crypt layer?
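
To check whether discards are currently passed through on a given
machine, one rough approach is to look for the allow_discards optional
parameter in the crypt target lines of "dmsetup table" output.  The
sketch below only parses text that has already been captured (for
example by running "dmsetup table" as root); the function name is
hypothetical, and the "name: start length target ..." line layout is an
assumption about that output.

```python
def crypt_discard_status(table_text):
    """Map each dm-crypt mapping name to True if allow_discards is set.

    Assumption: each line looks like
    "<name>: <start> <length> <target> <target args...>".
    """
    status = {}
    for line in table_text.splitlines():
        name, sep, rest = line.partition(": ")
        fields = rest.split()
        # Skip lines without a name prefix and non-crypt targets.
        if not sep or len(fields) < 3 or fields[2] != "crypt":
            continue
        status[name] = "allow_discards" in fields[3:]
    return status
```

A mapping that reports False here is still blocking discards at the
dm-crypt layer, whatever the filesystem above it is mounted with.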

I'm most concerned about the btrfs-specific case, where "mount -o
discard" is significantly riskier than running fstrim; it's a major
contributing factor to those old "btrfs ate my data" stories, and the
primary motivation for my Debian involvement is safe defaults for btrfs.

Or do we ship a default configuration that provides the best performance
for recent systems (from the last five years), and that is probably safe
on most mainstream systems?  It looks like that's where we are now.  In
this case, are release notes really enough for what sounds like a data
loss risk?

Best,
Nicholas
