
Re: merged /usr vs. symlink farms



On Sat, 2021-08-21 at 16:20 +0200, Wouter Verhelst wrote:
> On Sat, Aug 21, 2021 at 02:40:02PM +0100, Luca Boccassi wrote:
> > On Sat, 2021-08-21 at 10:26 +0200, Wouter Verhelst wrote:
> > > It bothers me that you believe "we've been doing this for a while
> > > and it didn't cause any problems, so let's just continue doing
> > > things that way even if the people who actually wrote the damn
> > > code say that path is littered with minefields and they're scared
> > > of what could happen when we finish the transition this way" is a
> > > valid strategy. It goes against everything I was taught to do to
> > > write reliable software.
> > 
> > Many people are bothered by many things - such is life. For
> > example, I am very bothered that it appears impossible to do any
> > kind of project-wide innovation in Debian,
> 
>   "I don't deny the benefits.  I do think that in the current
>   implementation, the drawbacks outweigh those benefits.  That's not
>   to say it couldn't be done.  But if it is done, we should do it
>   *right*.  We're Debian.  That's what we do."
>
>   -- Colin Walters,
>   https://lists.debian.org/debian-devel/2003/06/msg00475.html
> 
> It's true that there are other distributions out there who go for the
> quick-and-dirty solution, who want the feature before the benefits,
> downsides, and risks have all been fleshed out. There's a reason why
> I'm not contributing to those distributions; there's a reason why I
> don't use those distributions.
> 
> "Doing it right", even if that takes time, has proven benefits.
> 
> When the RPM world implemented "multiarch", they only supported
> installing 32-bit binaries on 64-bit versions of the same
> architecture. They did have that feature implemented and functioning
> in a few months or so, but the functionality of it was very limited --
> and even today it has problems, in that the way in which RPM checks
> that packages are correct has some inherent heuristics that can make
> mistakes. Yes, I've encountered those in practice on CentOS systems
> that my customers at the time really really wanted to get up and
> running again pretty quickly.
> 
> When Debian and Ubuntu implemented multiarch (look ma, no quotes),
> the time from concept to tests to implementation to public
> availability was *much* longer than it was in the RPM world; and while
> this work was unfinished, there was a lot of angry nagging about the
> lack of this feature and why can they do it in the RPM world you guys
> are idiots, but eventually it was implemented; and I think you'll
> agree that the dpkg implementation of multiarch is far superior to the
> RPM one: it's possible to use multiarch not just for compatibility
> with 32-bit versions of your 64-bit platform, as in the RPM world, but
> *also* for running arm binaries on x86 with qemu user emulation, or
> for cross-compiling, or for various other features that the RPM world
> can only dream of.

My recollection (which might be wrong, but a quick look at release
notes seems to support it with 11.04 having multiarch 2 years before
Wheezy) is that Canonical led the way with the multiarch effort in
Ubuntu, and Debian followed with lots of huffing, puffing and
grumbling.

> To get back to the point: I'm not saying we shouldn't merge /usr. We
> should; the benefits of a properly merged /usr far outweigh any
> disadvantages it may bring.  However, having an inconsistent dpkg
> database is far more serious than just "oh dpkg -S won't work as
> expected". It means dpkg isn't properly keeping track of which files
> belong to which package anymore, which means you will have issues
> with a package that Replaces: another, or with removing packages
> (especially with security-conscious binaries), or with diversions, or
> with alternatives, or with file conflicts, or with basically anything
> that asks dpkg about locations of files; and just dismissing it with a
> handwavy "ah well just run dpkg -S again" is so far removed from
> reality that it's not even funny. I think the dpkg maintainers are
> 100% correct to point out that that *is* a problem for which currently
> no viable solution seems to exist, and that any way forward *must*
> include a solution to that problem.
> 
> I'm not saying the solution which the dpkg maintainers are proposing
> is the only valid solution, but if you go and tell them "ah the real
> problems you point out are irrelevant" then You! Are! Doing! It!
> Wrong!

Again, if this dpkg bug were really that serious, there would be
visible consequences after almost 3 years of deployments across two
distributions with who knows how many million instances, and yet
"having to run dpkg -S again" is all we can see. Where are the bug
reports? Where are the enraged users with unusable, broken systems and
lost data? Where are the reports of Canonical going out of business
because Ubuntu is unusable? The bug is real, nobody doubts that - it
was filed against dpkg 20 years ago. What I take issue with is the
representation of its actual, real effects, and thus of its severity
and the consequences for the project. A lot of words are being spent
on how terrible and broken and unacceptable the status quo is, and yet
there is not a single link to a bug report.

By all means, go and fix it, make it a top priority for dpkg to sort
out, all hands on deck, whatever is needed - but demanding that the
entire project stand still, and de facto derailing the effort put in
to catch up with the rest of the world by imposing an unworkable,
demonstrably failed solution (symlink farms) to work around a dpkg bug
instead of fixing it internally, does not seem acceptable to me in any
way, shape or form without some real, serious evidence that the sky
has indeed fallen.

> [...]
> > The main point is that of course the insights of experts are
> > extremely important, incredibly valuable and worth careful
> > consideration, especially when making decisions about an unknown
> > future and events yet to unfold. But in this case these are
> > predictions about the past, a past that already exists and is lived
> > experience for many users here, and for all users in Ubuntu.
> 
> What that is, is anecdotal evidence. "We've been doing X for a while
> and it seems to not kill everything". Cool, great, awesome data
> points, but not likely to convince me that there won't ever be any
> problems. You can't prove the absence of bugs by anecdotal evidence;
> you can only prove the existence of them that way.
> 
> What the dpkg maintainers are providing is analytical evidence.
> "There's some corner cases here which need to be catered to". You
> just can't say that corner cases don't happen because "anecdotal
> evidence". That's just not how any of that works.

"It works for me" is anecdote; bug counts are not. They are a key
metric of this industry, and I am quite surprised this needs to be
spelled out. It's how we decide whether a release is ready or not. The
fact that there is 1 (one) known, encountered and unsolved bug in 2+
years across millions of instances and at least two separate distros
is not a one-off anecdote; it is high-quality hard evidence. What you
call analytical evidence, on the other hand, is a fancy word for
"opinion". Opinions are useful and interesting and important, but
saying "everything is broken" when there is a surprising lack of
evidence of that being the case is not very useful or constructive.

> [...]
> > The reality of this industry is that reliable software is an
> > oxymoron: the only bug-free software is the one that doesn't exist.
> 
> I said "reliable", not "bug-free". It's impossible to write bug-free
> software, I think we can agree on that.
> 
> However, going all hand-wavy about problems pointed out by people who
> know the code intimately is not likely to improve the reliability of
> the resulting system.

There is no bug-free software, therefore there is no fully reliable
software. But reliability and bug counts are hard metrics: how much
downtime has this theoretical bug caused, how many broken-beyond-repair
deployments, how many user/customer reports, and so on. So the next
question becomes: what is the rate of unreliability introduced by
this issue? Three years of evidence suggest very little, of a tiny
magnitude. That doesn't mean the bug doesn't exist; it means the
severity assigned to it needs to be appropriate.

Let's put it this way: if the dpkg -S root cause were unknown, I
_seriously_ doubt the bug report would get a Severity: critical and
warrant removal of dpkg (!) or stopping the Bullseye/Bookworm release
until it is solved.

-- 
Kind regards,
Luca Boccassi
