Re: Bug#1080956: SaunaFS versus LizardFS versus MooseFS
Hello Dmitry!
So, as I've noted before, I am paid to work on SaunaFS. Which means I
definitely have a bias, but I would like to make an objection on one of
the points here. Prepare for a long rant about MooseFS below.
I'll agree on a few points. First, MooseFS is indeed further ahead
feature-wise at this time (not sure about performance; maybe someone
could run a benchmark). I would personally really want
automatic tiered storage myself, but this can somewhat be achieved
manually by using the goals system to set and change the goals on
directories/files. EC has been around in LizardFS for years, so it's
more like MooseFS caught up with it :). Of course, MooseFS also had
EC for years, but I think it only became open source last year or this
year?
Second, MooseFS does indeed have a larger community than SaunaFS, but
the same can be said of other DFSs, say Ceph (I heard you are not a big
fan of it?). Just because something has a smaller community does not
warrant exclusion, in my opinion. But I can see that people will find it
easier to get support and help for MooseFS, and might prefer it because
of that (outside of asking for paid support).
** RANT BEGIN **
But I definitely object to the point that it makes more sense to
contribute to MooseFS, because there's been hardly any contribution to
MooseFS from
the outside, apart from a few small typo/bug fixes. I think that's
because nobody wants to work with that codebase and deal with their only
developer.
I'm fairly confident that there's really only one developer working on
MooseFS, and that it has been like that for about 20 years (I don't know
how old MooseFS is exactly, but I guess that's around its entire
existence). I base that confidence on one source who
worked directly with that developer (who long ago worked for the same
company supporting MooseFS) and me simply working with his code, because
despite the refactoring the LizardFS people did, I think about 33-50% is
still his code (and boy does it stick out like a sore thumb).
I'd describe the code charitably as "the work of a genius", because
actually understanding the code requires that same genius who wrote it.
I swear, I've almost never seen him use structs for anything but linked
lists; in all the places where structs would make sense, there are just
one-character variables declared at the start of the function. And
instead of data structures like hash maps for things like open files
(there are a lot of them if you create a lot of files at the same time),
linked lists are used (and then people wonder why creating a lot of
files is slow). Maybe this was changed in later MooseFS versions, but it
doesn't inspire confidence in me that the code he writes now is better.
From what I've seen, there have been a few improvements, but the code
isn't any more understandable.
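To illustrate why this matters, here is a minimal Python sketch (not
MooseFS code; every name in it is made up for the example). It counts how
many nodes a linked-list lookup has to visit to find one open file, and
keeps the same entries in a hash map (a Python dict) for comparison:

```python
# Hypothetical illustration only; none of these names come from MooseFS.

class Node:
    """One entry in a singly linked list of open files."""

    def __init__(self, inode, next_node=None):
        self.inode = inode
        self.next = next_node


def build_list(n):
    """Prepend inodes 0..n-1, so inode 0 ends up at the tail of the list."""
    head = None
    for inode in range(n):
        head = Node(inode, head)
    return head


def list_find(head, inode):
    """Walk the list; return the number of nodes visited, or -1 if absent."""
    steps = 0
    node = head
    while node is not None:
        steps += 1
        if node.inode == inode:
            return steps
        node = node.next
    return -1


def build_table(n):
    """The same open files, keyed by inode in a hash map."""
    return {inode: True for inode in range(n)}


open_files_list = build_list(10_000)
open_files_table = build_table(10_000)

# The list has to visit every node to reach the oldest entry.
worst_case_steps = list_find(open_files_list, 0)

# The dict lookup does not scan the other entries at all.
found_in_table = 0 in open_files_table
```

With N open files, each list lookup can visit up to N nodes, so creating
N files with one lookup each costs on the order of N*N steps, while the
hash map keeps each lookup roughly constant on average.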
Perhaps my favorite piece of code is this masterpiece of Python, which
conditionally prints out HTML/CSS/JS for about 8000 lines. It had been
in the codebase since at least 2009 and was only removed this year, when
Python dropped support for the cgi module (and the new GUI was at least
partially developed by someone else, so progress):
https://github.com/moosefs/moosefs/blob/v3.0.118/mfsscripts/mfscli.py.in
(BTW, SaunaFS/LizardFS also had this, as it inherited it from MooseFS
when LizardFS forked. But SaunaFS is also getting rid of it, as there is
now a fully rewritten version of it available)
I think one can see why the LizardFS developers focused on aggressively
refactoring the code of MooseFS. It's because otherwise it could not be
worked on at all. My only criticism of this is that they decided to
slowly transition to C++ at the same time. That just made the work more
complicated. But LizardFS was in the end easier to fork
for SaunaFS because:
1) There was a lot of refactoring already done. While the LizardFS code
wasn't great (the chunkserver did extraneous writes of blocks to the
disk, causing a lot of slowdown), it was at least understandable how it
works.
2) They wrote actual tests for it and those tests were open source.
Either the MooseFS developer rarely creates bugs (I guess that could
come from 20 years of experience developing a DFS by yourself), or they
have some great internal testing tools, or the best QA team on earth.
But none of that matters when you are forking, as you can't use any of
them.
I kinda feel bad for the management of the company behind MooseFS. If
that one developer leaves, they would pretty much lose all the
institutional knowledge of the codebase and would be in the same
position as LizardFS was when it forked.
If anybody wants to fork MooseFS, my advice would be: Don't. Instead,
create functionality tests of MooseFS, then write the code from scratch
to conform to said tests.
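As a rough sketch of what such a functionality test could look like,
here is a small black-box check in Python. It only uses ordinary file
operations, so it does not depend on how the filesystem is implemented.
The function names are made up for this example, and a temporary
directory stands in for a real mount point (an assumption; in practice
you would point it at the mounted filesystem under test):

```python
import os
import tempfile


def check_write_read_roundtrip(mount_point, payload=b"hello" * 1024):
    """Write a file under mount_point, read it back, and compare the bytes."""
    path = os.path.join(mount_point, "roundtrip.bin")
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before reading
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data == payload


def check_rename_keeps_content(mount_point):
    """Rename a file and verify the content moved and the old name is gone."""
    src = os.path.join(mount_point, "a.txt")
    dst = os.path.join(mount_point, "b.txt")
    with open(src, "w") as f:
        f.write("content")
    os.rename(src, dst)
    with open(dst) as f:
        ok = f.read() == "content" and not os.path.exists(src)
    os.remove(dst)
    return ok


# Stand-in for a real mount point (assumption for this sketch).
with tempfile.TemporaryDirectory() as mount_point:
    results = {
        "roundtrip": check_write_read_roundtrip(mount_point),
        "rename": check_rename_keeps_content(mount_point),
    }
```

Because checks like these only describe observable behaviour, the same
suite could be run against the original implementation and against a
rewrite.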
** RANT END **
So to summarize my rant: I see it as pretty much impossible to
technically contribute to MooseFS, even if the developer wanted it. And
once that developer stops working on MooseFS, I think that will mostly be
the end of its development, unless the rest of the MooseFS team can
turn it around.
After two years of working on SaunaFS, I'd be more than willing to help
new people contribute features/fixes to it. I don't know if the same can
be said of MooseFS.
But as you said, you haven't had a single problem with MooseFS, and I
believe you when you say it's reliable. In the end, only the user
experience matters; it doesn't matter how the sausage is made ;). And
that's something I've focused on as well as the SaunaFS maintainer:
cautiously introducing new features/refactors, and doing QA on my own
personal cluster for every release (while also dogfooding it internally
before even making a release candidate), to make sure the scenario you
faced with LizardFS never happens again.
So I don't buy the argument that MooseFS is purely better.
Feature-wise, sure, but that gap is getting narrower every day (for
example, there will be TLS support in the next two months or so,
something MooseFS has never had in its entire existence).
Reliability-wise, that's something we are trying to repair after the
LizardFS fiasco, and we've not had major issues in the two years we have
been out (there were a couple of releases we did pull, but the issues
were very situational and were patched quickly, unlike LizardFS, where
the bug you described is still unfixed to this day).
Development health and code-wise? No, I'd argue MooseFS is much worse.
Best regards,
Urmas Rist
On Wed Dec 24, 2025 at 1:28 PM EET, Dmitry Smirnov wrote:
> Dear all,
>
> Let me add some context with few insights regarding packaging of SaunaFS.
>
> Years ago I introduced LizardFS into Debian, maintained it throughout
> entirety of its life span (while being actively involved upstream),
> then requested its removal (in favour of MooseFS).
>
> LizardFS began as a fork of MooseFS with few added features (erasure coding,
> master fail-over). LizardFS attracted a small community of users while its
> developers undertaken aggressive code re-factoring (and conversion to C++)
> which eventually got out of control, leading to grave/catastrophic bug
> that caused massive loss of data (which burned me badly).
>
> Shortly after that, development of LizardFS stagnated, then stopped, with
> team of developers seemingly being dissolved, doing no public work for
> number of years. LizardFS community ceased to exist, with most of its
> former users moving to MooseFS or other technologies.
>
> In the beginning, LizardFS had feature parity with Moosefs. But active
> development of MooseFS continued in the meantime, with regular releases
> as well as major release that added erasure coding and extraordinary
> Storage Classes feature for tiered storage and precise control for data
> placement.
>
> Feature-wise, MooseFS is certainly much ahead of a relatively new LizardFS
> fork. While SaunaFS is yet to prove itself, I personally vouch for
> reliability of MooseFS that I recommend wholeheartedly.
>
> MooseFS have been developed with competence. It is a trustworthy technology
> with active community and good online support. MooseFS appears to be better
> through and through.
>
> Considering state of affairs with LizardFS, I find it hard to justify SaunaFS
> for inclusion to Debian, especially given the fact that packaging of superior
> MooseFS (introduced and maintained by yours truly) is already available.
>
> IMHO it would make more sense to assist with maintenance of MooseFS as well
> as to collaborate with its upstream developers rather than trying to revive
> obsolete technology with no community behind it.