Re: COVID [was: Re: DebConf 25 Daily announcements - 2025.07.15 - Daytrip information && DebConf Day 2]
CW: wall of text, probabilities, and some code
TL/DR: GIGO
Hi Alex,
On 2025-07-18 22:43, Alex Lieflander wrote:
> Here's an example calculation of the risk with some **rough**
> approximations. Let's assume that:
Let me first point out some flaws in your analysis, without even
discussing the numbers yet.
Risk of being infected is a function of many parameters. One is
exposure to pathogens in the air, which is itself an integral, over
time, of a function of the distance from a pathogen emitter. Others
include the strain of virus, temperature, sun, relative humidity, air
circulation, use of masks by none/one/both of the persons involved, type
of mask, whether the mask is worn properly, whether the wearer of an
FFP2 mask has a beard, vaccination status, and so on. Your model doesn't
account for distancing (which means increasing the distance to others
and minimizing the time spent close to them). It's also likely that a
short but intense exposure will result in a worse outcome than the same
amount of exposure spread over an extended period of time, but as I
could not find any data about this I will ignore this possibility as
well.
Your reasoning about probabilities is also wrong. Let's assume, to keep
your 5-to-1 ratio, that the unmasked COVID-infected folks (let's call
them group A, 6 of them) each emit pathogens into ambient air at 5 times
the rate of the mask-wearing, knowingly COVID-positive folks (let's call
them group B, 14 of them). Group A then accounts for
6×5 / (6×5 + 14) = 0.68182 of the pathogens in ambient air, while group
B accounts for the remainder = 0.31818. Within each group every member
contributes equally, so the probability that a given pathogen came from
one particular member is 0.68182 / 6 = 0.11364 for a member of A, and
0.31818 / 14 = 0.022727 for a member of B.
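The shares above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the preceding paragraph, not the code from the snippet linked in this message; the function name and dictionary keys are made up for illustration.

```python
def emission_shares(n_unmasked, n_masked, ratio):
    """Share of airborne pathogens attributable to each group and each member.

    Emissions are counted in units of one masked emitter; an unmasked
    emitter is assumed to emit `ratio` times as much.
    """
    total = n_unmasked * ratio + n_masked
    share_a = n_unmasked * ratio / total
    share_b = n_masked / total
    return {
        "group_a": share_a,
        "group_b": share_b,
        "member_a": share_a / n_unmasked,  # uniform within group A
        "member_b": share_b / n_masked,    # uniform within group B
    }


shares = emission_shares(n_unmasked=6, n_masked=14, ratio=5.0)
for name, value in shares.items():
    print(f"{name}: {value:.5f}")
```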
Now if we remove group B entirely, the amount of pathogens present in
ambient air would go down accordingly: it would only be 68.182% of what
it was with group B present. Making a few more unwise and unrealistic
assumptions, such as the initial overall probability for an attendee to
become infected at this event being 6×0.05 + 14×0.01 = 0.44, and the
probability of becoming infected being a linear function of the
exposure, removing group B entirely would lower the overall probability
of any attendee becoming infected to 0.44 × 0.68182 = 0.3.
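A short sketch of that linear model, under the same unrealistic assumptions (risk proportional to exposure, 5% per unmasked emitter and 1% per masked emitter). The constant and function names are hypothetical.

```python
P_UNMASKED = 0.05  # assumed risk contribution of one unmasked emitter
P_MASKED = 0.01    # assumed risk contribution of one masked emitter


def event_risk(n_unmasked, n_masked):
    """Infection risk at one event, assumed linear in the total exposure."""
    return n_unmasked * P_UNMASKED + n_masked * P_MASKED


baseline = event_risk(6, 14)  # groups A and B both present
isolated = event_risk(6, 0)   # group B removed entirely

# Prints 0.44, 0.30 and 0.68182: the risk drops by exactly group B's
# share of the pathogens, because the model is linear.
print(f"{baseline:.2f} {isolated:.2f} {isolated / baseline:.5f}")
```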
Your own calculation gives the probability for someone to not get
infected after attending a number of successive events, each with one
masked or unmasked SARS-CoV-2 emitter. This is not the same thing as
attending a single event with all emitters simultaneously present, for
which we assume that n times the pathogen concentration in ambient air
implies n times the contamination risk (and at some point we would reach
a saturation value, where the pathogen concentration is so high that
ending up infected is a certainty).
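The difference between the two readings can be sketched as follows. The "successive events" formula is an assumption about how that calculation works (independent encounters, one emitter each); the saturation is modelled as a plain cap at 1, which is another simplification. Function names are hypothetical.

```python
def successive_events_risk(n_unmasked, n_masked, p_unmasked=0.05, p_masked=0.01):
    """Risk after a series of independent encounters, one emitter at a time.

    ASSUMPTION: this is 1 minus the probability of escaping every encounter.
    """
    escape = (1 - p_unmasked) ** n_unmasked * (1 - p_masked) ** n_masked
    return 1 - escape


def single_event_risk(n_unmasked, n_masked, p_unmasked=0.05, p_masked=0.01):
    """Risk at one event with all emitters present, linear in exposure.

    The result is capped at 1.0 to mimic saturation.
    """
    return min(1.0, n_unmasked * p_unmasked + n_masked * p_masked)


for n_unmasked, n_masked in [(6, 14), (6, 0), (30, 14)]:
    successive = successive_events_risk(n_unmasked, n_masked)
    single = single_event_risk(n_unmasked, n_masked)
    print(f"A={n_unmasked:2d} B={n_masked:2d} "
          f"successive={successive:.4f} single={single:.4f}")
```

For small probabilities the two models stay close, which is why the results often look similar; they diverge as the number of emitters grows, and only the linear one reaches certainty.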
By chance your own calculations often end up close to my own results,
but some more fun awaits below. I would not dare to say that I mastered
that one course on probabilities that I had to take maybe some 25 years
ago, I just did better than my peers at that time, and that's not worth
much, trust me on this.
Let's now have a look at some of your numbers that are clearly off:
> - 20% of people with COVID are either asymptomatic or ignore their
> symptoms
I would bet that this one is way over 50%, maybe up to 80% with the
current variants. That is, for every single COVID-infected person that
has symptoms strong enough that they can't be ignored, you could have 4
other persons that have no symptoms, or symptoms so weak that they would
not even suspect that they could have COVID.
For a start, every COVID-infected person starts out asymptomatic for a
few days while already spreading the disease; then it was confirmed by
local health officials that a good fraction of COVID-spreading persons
won't experience symptoms (or will have symptoms so light that they
won't notice them). On top of that there are people who will
deliberately ignore their symptoms. As a concrete example of lacking
symptoms, at least one of the COVID-positive persons at this DebConf
only tested because they spent some time close to me at some point; they
never actually experienced any significant symptoms, but decided to
self-isolate nonetheless.
> - The probability of a particular person being infected by an unmasked
> person with COVID is 5%
> - The probability of a particular person being infected by a COVID
> positive person wearing a mask is 1%
If this is read incorrectly (as you did) as “the probability of each
single unmasked/masked person with COVID infecting any single attendee
is 5%/1%” then (ignoring a few details, like that an infected person
cannot be infected again, or that a newly infected person can now spread
the disease after one or two days) with your other hypotheses the
overall probability for an attendee to be infected at this event would
be 6 × 5% + 14 × 1% = 44%.
And thankfully that nightmare 44% (or even 26% with your model)
contagion scenario did not actually happen, far from it AFAIK. As I
suspect that there were actually far more unsuspecting (and unmasked)
COVID-infected people present than in your hypothesis, this would mean
that the actual contagion probabilities were much (as in: several orders
of magnitude) lower.
Finally let's have some fun with fuzzing: assuming any of the rough
hypotheses above could be off by, say, half an order of magnitude (that
is, 10^0.5 = 3.16), let's run both your and my model with lower and
higher values and see which numbers we end up with.
I've published the corresponding Kotlin code [1]. Unfortunately the
Kotlin playground [2] won't run it as is, because it uses full
reflection, which is not allowed in the playground. On a Debian system
you can install the kotlin package and follow the instructions in the
header to compile and run it (or just remove the fuzzing code to run the
models in the web playground).
[1]: https://salsa.debian.org/-/snippets/793
[2]: https://play.kotlinlang.org/
#0 - There are 400 Debconf attendees
#1 - 5% of Debconf attendees currently have COVID
#2 - 20% of people with COVID are either asymptomatic or ignore their
symptoms
#3 - COVID tests give a false negative 10% of the time
#4 - 100% of people who know they have COVID wear a mask at all times
#5 - 0% of unknowingly COVID-positive people wear masks
#6 - The probability of a particular person being infected by an
unmasked person with COVID is 5%
#7 - The probability of a particular person being infected by a COVID
positive person wearing a mask is 1%
I kept #0 constant, made the "upper" bound of #5 100% (would have
remained 0 otherwise), and #4 has no "upper" bound as it's already at
its max.
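For readers who don't want to set up Kotlin, the idea can be sketched in Python. This is NOT a port of the snippet in [1]: how the hypotheses combine into emitter counts, the use of three values (lower, baseline, upper) per hypothesis, and the clamping to [0, 1] are all assumptions made for this illustration, and counts are left unrounded (5.6 and 14.4 emitters rather than 6 and 14), so the numbers will differ from the ones quoted in this message.

```python
from itertools import product

FACTOR = 10 ** 0.5  # half an order of magnitude, about 3.16
ATTENDEES = 400     # hypothesis #0, kept constant

# Baseline hypotheses #1..#7, expressed as fractions.
BASE = {1: 0.05, 2: 0.20, 3: 0.10, 4: 1.00, 5: 0.00, 6: 0.05, 7: 0.01}


def variants(key, value):
    """Lower, baseline and upper value of one hypothesis, clamped to [0, 1]."""
    if key == 5:
        return [0.0, 1.0]  # 0% cannot be scaled, so the upper bound is 100%
    values = {value / FACTOR, value, min(1.0, value * FACTOR)}
    return sorted(values)  # the set drops the redundant upper bound of #4


def emitters(h):
    """ASSUMPTION: expected numbers of (unmasked, masked) emitters."""
    infected = ATTENDEES * h[1]
    # Unaware: no (acknowledged) symptoms, or symptomatic with a false negative.
    unaware = infected * (h[2] + (1 - h[2]) * h[3])
    aware = infected - unaware
    masked = aware * h[4] + unaware * h[5]
    return infected - masked, masked


def linear_model(h):
    """Single event, risk linear in exposure and capped at certainty."""
    unmasked, masked = emitters(h)
    return min(1.0, unmasked * h[6] + masked * h[7])


def independent_model(h):
    """Successive independent encounters, one emitter at a time."""
    unmasked, masked = emitters(h)
    return 1 - (1 - h[6]) ** unmasked * (1 - h[7]) ** masked


keys = sorted(BASE)
grid = [dict(zip(keys, combo))
        for combo in product(*(variants(k, BASE[k]) for k in keys))]

for model in (linear_model, independent_model):
    risks = [model(h) for h in grid]
    print(f"{model.__name__}: {len(grid)} combinations, "
          f"min={min(risks):.4f}, max={max(risks):.4f}")
```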
This yields, indeed, "interesting" results.
Depending on the combination and the model, the number of newly infected
attendees (without self-isolation) could be as low as 9 (both models,
p=0.0229), or as high as 387 with my model (p=0.98473) or 364 with yours
(p=0.95789). And the difference in the probability of getting infected
introduced by self-isolation ranges all the way from 0 (in several
combinations with high contamination by unknowingly infected people,
e.g. when raising #1 through #3) to 0.99211 (when lowering #1 through #3
and #7, and raising #5 and #6).
My conclusion is that this estimate of yours isn't worth anything, and
neither are mine in this message. Actual contagion probabilities depend
on too many factors, most of them impossible to estimate with any
reasonable accuracy, and AFAICT researchers in the field do not even try
to figure them out. That's why they focus on other metrics such as R
numbers, relative risk (or odds) factors, and cost assessments (i.e.
cost of not implementing a policy vs cost of implementing it). We should
keep that in mind while discussing this policy.
Cheers,
--
Julien Plissonneau Duquène