Re: Q for the Candidates: How many users?

So at the start of the week, I asked:

On Mon, Mar 22, 2010 at 05:19:20PM +1000, Anthony Towns wrote:
> Bearing in mind:
>   * www.debian.org/social_contract says Debian's "priorities are our
>     users and free software",
>   * popcon.debian.org currently reports 91,523 submissions,
>   * popcon.ubuntu.com currently reports 1,493,440 submissions, and
>   * that this is something of a trick question,
> What's your estimate of the current number of Debian users?

(I haven't seen a reply from Charles Plessy, but it's been a few days,
so moving on seems fair...)

For me, the "trick" part of the question is whether or not the machines
reporting to popcon.ubuntu.com should be counted as "Debian users".
Both Stefano and Marga picked up on that, without actually stating an
opinion either way:

On Mon, Mar 22, 2010 at 09:49:47PM +0100, Stefano Zacchiroli wrote:
> I've already reported in
> a previous thread on -vote [1] about the study by Gaudenz Steinlin
> showing that our user base has been decreasing steadily since the first
> release of Ubuntu (whose users you might want to consider as Debian
> users or not).
> [1] http://lists.debian.org/debian-vote/2010/03/msg00143.html

On Wed, Mar 24, 2010 at 03:08:29PM -0300, Margarita Manterola wrote:
> Do "Debian" users include "Debian derivatives" users? :)

There's a bunch of people who use Ubuntu as their main systems these
days who've said things like "yeah, I know, I'll install Debian on it
some time, but this was just easy for now". For me, I tend to consider
them to already be using Debian systems -- I mean, they're already
using all the Debian-specific programs I've written or worked on for
Debian, so what difference does it /really/ make if it's all coloured
brown or purple instead of swirly and red? I don't see a difference,
so I count them as Debian users personally. YMMV (if it does, I'd be
interested to hear what important differences you see)

If Ubuntu systems count as Debian systems, who, in that case, counts
as Debian users as far as the social contract's concerned? The actual
people who install it, even if they've never heard of Debian? Maybe
the Ubuntu devs who are pulling source from Debian? Debian developers
nominally promise to put our "users" first in our priorities, right up
there with free software itself. Based on the popcon numbers it's
possible almost 95% of Debian's userbase get at Debian through Ubuntu.
Using Google trends or distrowatch or others would probably give you
different percentages, but it still seems pretty significant, and
maybe warrants more attention.
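
For reference, the "almost 95%" figure comes straight from the two
popcon counts quoted at the top of this mail. A minimal sketch of that
arithmetic, assuming popcon submissions are a fair proxy for relative
user numbers (the function name is just illustrative):

```python
# Popcon submission counts quoted at the top of this mail.
DEBIAN_POPCON = 91_523
UBUNTU_POPCON = 1_493_440


def ubuntu_share(debian_reports: int, ubuntu_reports: int) -> float:
    """Fraction of the combined popcon submissions that come from Ubuntu."""
    return ubuntu_reports / (debian_reports + ubuntu_reports)


share = ubuntu_share(DEBIAN_POPCON, UBUNTU_POPCON)
print(f"{share:.1%}")  # prints 94.2%
```

This says nothing about how representative popcon is on either side;
it only restates the ratio of the two submission counts.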

Stefano and Marga both wondered explicitly what the point is:

On Mon, Mar 22, 2010 at 09:49:47PM +0100, Stefano Zacchiroli wrote:
> I'm not going to give an actual estimate because I lack a significant
> amount of needed data and, frankly speaking, I don't see why the
> exercise of actually arriving at a number might be interesting as
> campaigning material (if I'm missing something fundamental about why it
> is so, please explain).

On Wed, Mar 24, 2010 at 03:08:29PM -0300, Margarita Manterola wrote:
> I think this question is indeed very tricky, and I don't see the point
> of it being posted as a question during the "campaign" period.  How
> can my estimation change your vote?

I haven't picked who I'm voting for yet, so it can't actually change
my vote... That aside, I think it lets the candidates demonstrate how
they approach a problem, which can be interesting.

  * do they actually answer the question or avoid it?

     - Wouter gave a guess based on a multiplier for the popcon data

     - Stefano declined to give an answer, but suggested augmenting the
       popcon data with download counts from mirrors and security.d.o,
       along with a reference to some related research

     - Marga declined to give an answer beyond a very conservative
       "many more than [the] 90k [from popcon]"

     - Charles hasn't answered yet

  * how do they react to the Ubuntu reference?

     - Wouter mostly ignored it, and said "I don't think it holds any
       relevance to what we do" in relation to whether popcon should be
       opt-in or opt-out

     - Stefano noted "our user base has been decreasing steadily since
       the first release of Ubuntu" and was concerned about it

     - Marga didn't mention it, except to ask if users of derivatives
       count as users of Debian

  * when asked a pretty straightforward question, do they have any
    new/deep insights? do they do any interesting analysis to come to
    a more useful conclusion than others are able to?

     - For me, those are pretty much killer features in a candidate for
       just about anything. YMMV :)

  * do they build on other people's estimates, or change their own based
    on new ideas by other people?

     - It's free software, collaboration is important. It's an election,
       collaboration is Just Not Done.

The other aspect is, how can you say "we're listening to our users"
if we don't even have any idea how many of them there are? And
presumably being able to say "we've got $BIGNUM of users" is useful
for promotion, and "we've got $PRECISE users" might be useful for
capacity planning to some extent. I think those probably should be
things prospective DPLs should think about, at least briefly.

As far as the actual number goes, I don't have a better estimate. When
I was DPL, I happened to also be ftpmaster, and I think I had access
to apache logs on security.d.o and ftp.d.o for a while, so I could
have made an estimate along the lines of Stefano's suggestion:

On Mon, Mar 22, 2010 at 09:49:47PM +0100, Stefano Zacchiroli wrote:
> Then we can look at the official mirrors logs
> (for distinct IPs regularly downloading package indexes in a given time
> window), and at the same index downloads for security.d.o (which is
> enabled by default and most likely not accessed via mirrors).

I actually thought I'd done that at some point just for kicks, but I
don't seem to be able to find what the results might have been. (Note
security.d.o resolves to different IPs in different countries these
days; and both those measures are affected by undercounting due to
proxies and overcounting due to dynamic IPs among other things)

I note Steve points out:

On Mon, Mar 22, 2010 at 03:06:37PM -0700, Steve Langasek wrote:
> http://lxer.com/module/newswire/view/123481/index.html

which cites estimates of 6-8 million Ubuntu users, for a factor of
between 4 and 5.3 compared to Ubuntu's popcon reports. If you assumed
a similar factor for Debian's popcon reports, that would give an
estimate of between 350,000 and 500,000 users. I tend to think Ubuntu
users are more likely to run popcon than Debian users, and thus that
those numbers are low, but I don't have any data to back that up or to
estimate how low. And obviously I don't have any idea where the 6-8M
estimates were pulled from, or how realistic they are.
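
The scaling above can be reproduced with a short sketch. It assumes,
as the paragraph does, that the ratio of real users to popcon reports
is the same for Debian as for Ubuntu; the function name is made up for
illustration, and the 6-8M inputs are the unverified figures from the
lxer.com link.

```python
# Popcon submission counts quoted at the top of this mail.
DEBIAN_POPCON = 91_523
UBUNTU_POPCON = 1_493_440


def scaled_estimate(reports, reference_users, reference_reports):
    """Scale a popcon count by the users-per-report factor of a reference distro."""
    factor = reference_users / reference_reports
    return factor, reports * factor


# Low and high ends of the cited Ubuntu user estimates.
for ubuntu_users in (6_000_000, 8_000_000):
    factor, debian_users = scaled_estimate(
        DEBIAN_POPCON, ubuntu_users, UBUNTU_POPCON
    )
    print(f"factor {factor:.1f} -> roughly {debian_users:,.0f} Debian users")
```

The two results land inside the 350,000 to 500,000 range given above.
If Debian users really are less likely to run popcon, the true factor
for Debian would be higher and these numbers would be a lower bound.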

> > Russ Allbery <rra@debian.org> writes:
> >> We're running somewhere in the neighborhood of 500 Debian stable
> >> servers. I'm afraid running popcon on them is a non-starter so far due
> >> to concerns about information exposure. When this was previously
> >> discussed on debian-devel, it became clear that we're far from the only
> >> ones in that situation.

For the record, I'm similar; I think I'm running popcon on one or two
machines out of about a dozen Debian systems. Both those anecdotes
would indicate a ratio of more than 6 systems per popcon report, but
it's hard to say. The plural of anecdote isn't data, etc.

Anyway, actual estimates were:

On Mon, Mar 22, 2010 at 09:41:43PM +0100, Wouter Verhelst wrote:
> Raw guess: around a million.

On Wed, Mar 24, 2010 at 03:08:29PM -0300, Margarita Manterola wrote:
> I certainly do believe that we have many more than 90k users, [...]

What can we say about that?

One thing to think about is that there are just under 29k (binary)
packages in unstable (main/contrib/non-free, i386) at the moment -- if
we've got 90k users that's over 3 users per package we're maintaining,
which I think would go a long way to explaining why there seems to be
more work to be done than people to do it.

Say the average package needs three hours a month to maintain --
checking against upstream updates, responding to bug reports,
following changes to policy or packaging tools, integrating with other
parts of the system etc. Say the average contributor has one hour a
week to spend on actually doing stuff (after following flamewars on
lists, chatting on IRC, understanding changes, filing bugs/patches
against other packages, doing real work, etc). And say that for every
30 users you get one contributor. Then the maths works out like:

    Users:           600000
    Contributors:     20000  (1 per 30 users)
    Hours per month:  80000  (4h per contrib)
    Packages:         26667  (1 per 3 hours)

Which is of course less than the actual number of packages we have,
and doesn't count either core infrastructure or Debian native

I also don't think there are 20k people contributing to Debian (even
counting DMs, webwml and alioth committers and people who help with
bug reports), which means some combination of the average package
getting even less time dedicated to it, or the average Debian
contributor spending much more time on Debian...
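
The back-of-envelope model above is easy to play with in a few lines
of Python. This is only a sketch of the same guesses: the parameter
names are invented here, and the defaults (30 users per contributor,
4 hours per contributor per month, 3 hours per package per month) are
the assumptions from the table, not measured values.

```python
def packages_supportable(
    users,
    users_per_contributor=30,       # guess: 1 contributor per 30 users
    hours_per_contributor=4,        # guess: about one hour a week
    hours_per_package=3,            # guess: monthly upkeep per package
):
    """Return (contributors, hours per month, packages maintainable)."""
    contributors = users / users_per_contributor
    hours = contributors * hours_per_contributor
    packages = hours / hours_per_package
    return contributors, hours, packages


contributors, hours, packages = packages_supportable(600_000)
print(round(contributors), round(hours), round(packages))
# prints 20000 80000 26667
```

Changing any one of the three guesses shows how sensitive the result
is: with fewer real contributors than 20k, either the hours per
contributor have to go up or the hours per package have to come down
for the number of packages to stay where it is.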

There are lots of ways to deal with that: smarter tools can give you
better quality for less time, more enthusiasm can give you more time
per contributor, and more encouragement can give you more
contributors; but ultimately you're still drawing all your
contributions from your pool of users.

On Mon, Mar 22, 2010 at 05:46:49PM -0700, Russ Allbery wrote:
> ... my personal goal for
> Debian is not for it to be the most popular distribution.

For the record, likewise. And yet... if you count Ubuntu systems as
Debian systems in disguise (which, like I said, I do) it *is* the most
popular Linux distribution.

My mind continues to boggle at this; but equally I'm incredibly
satisfied that the solutions we've come up with in Debian over the
past ten or more years are helping literally millions of people.
That's millions of people who'd probably otherwise be dealing with
proprietary, non-Unix systems at that. With some justification, people
often dismiss the whole concept of "build it and they will come" as
being naive wrt how the world works, and yet... we did build it, and
they did come. That's pretty remarkable, IMO.

> AJ's question, and particularly his other longer response to the question
> about disappearing DPLs, really highlight what I think are some
> disagreements between he and I about how we see Debian.  I fundamentally
> do not believe in the "grow or die" model or think that projects need to
> constantly move on to the next shiny thing.

I don't think I talked about growth for Debian in either of those
mails; personal growth certainly, but that's rather different,
particularly given the personal, uh, shrinkage that comes with no
longer being DPL.

Anyway, I don't actually believe in "grow or die" -- but "grow or
stagnate" seems closer to a genuine choice, and if "stay pretty much
the same" is the goal, well, it's software -- leaving lenny up as the
final Debian release would hit that goal pretty easily, and everyone
could spend their time doing other things.

A lot of it's something of a cultural choice between maintenance and
development, IMO -- if you consider yourself a developer, the whole
point is to solve problems, and there are always new problems to
solve: not enough food? bam, agriculture, solved. oh, obesity? bugger.
If you're focussed on maintenance (like say, server admin or package
maintenance), all the changes developers do are annoying and cause
work -- you have to patch your code to deal with changes to header
paths, or add a field to specify the source format version or

I don't think you can really do without either of those roles --
without developers, apt wouldn't exist; without maintainers, people
would be too busy adding features to bother patching the security
holes in the last version. Growth (new users, new uses, better at old
uses) is definitely a developer's perspective.

> I don't believe in it for
> economies, I don't believe in it for businesses, and I don't believe in it
> for Debian.  I don't think that's a goal to pursue (or, for that matter,
> to not pursue).

Except for differences in phrasing, I think not having growth as a
goal would be a terrible idea. Imagine if Debian had stopped growing
five or ten years ago: no apt, no debconf, no debhelper, no
debootstrap, no pbuilder, no kfreebsd port, no testing suite, no
udebs/d-i. Fewer packages, fewer architectures, and so on. I don't
think you'd find
that as compelling an alternative to RHEL for your systems as Debian
currently is, even if you don't actually take advantage of everything
there. If Debian stopped growing now (or two years ago, or two years
in the future) that might be great for a while (fewer new bugs!) but
it's a safe bet that someone will have a good idea somewhere else
that'll eventually outweigh Debian's current superiority.

> I'd much rather focus on doing good work and encouraging
> and mentoring contributors and letting metrics like total user count do
> whatever they do.

User count (total or in particular areas like mass deployments versus
individual ones) is a good way of independently measuring when other
groups have compelling ideas, not to mention validating that your own
ideas are as awesome as they sounded inside your head. Subjectively,
your own ideas are almost always going to be more compelling than
anyone else's. There's a big difference between paying attention to
other people's ideas and replacing your own with them, of course.

On Wed, Mar 24, 2010 at 09:37:15AM +0100, Andreas Barth wrote:
> I need to disagree a bit: I believe in "grow or die", but grow
> doesn't need to be in quantity. So, if we get better and better (and
> our tools easier to work with, etc) we also grow but in quality. (And
> if I compare squeeze with sarge I can see lots of differences where
> looking back I always think "oh, this obviously was quite painful".)
> (Of course, this supports everything else you said.)

And to (hopefully) help clarify my views: "me too" on this.


Anthony Towns <aj@erisian.com.au>
