
Re: COVID [was: Re: DebConf 25 Daily announcements - 2025.07.15 - Daytrip information && DebConf Day 2]



CW: wall of text, probabilities, and some code
TL/DR: GIGO

Hi Alex,

On 2025-07-18 22:43, Alex Lieflander wrote:

> Here's an example calculation of the risk with some **rough** approximations. Let's assume that:

Let me first point out some flaws in your analysis, without even discussing the numbers yet.

The risk of being infected is a function of many parameters. One is exposure to airborne pathogens, which is itself an integral over time of some function of the distance from a pathogen emitter. Others include the strain of the virus, temperature, sunlight, relative humidity, air circulation, whether none, one or both of the persons involved wear a mask, the type of mask, whether the mask is worn properly, whether the wearer of an FFP2 mask has a beard, vaccination status, and so on. Your model doesn't account for distancing (that is, increasing the distance from others and minimizing the time spent close to them). It's also likely that a short but intense exposure will result in a worse outcome than the same amount of exposure spread over an extended period of time, but as I could not find any data about this I will ignore this possibility as well.

Your reasoning about probabilities is also wrong. To keep your 5-to-1 ratio, let's assume that each unmasked COVID-infected person (let's call them group A, 6 of them) emits pathogens into the ambient air at 5 times the rate of a mask-wearing, knowingly COVID-positive person (let's call them group B, 14 of them). Group A is then responsible for 6×5 / (6×5 + 14) = 0.68182 of the pathogens in the ambient air, and group B for the remaining 0.31818. For a single pathogen emitted by either group, the probability that it came from any particular member of that group is uniform: 0.68182 / 6 = 0.11364 for a member of A, and 0.31818 / 14 = 0.022727 for a member of B.
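
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the share calculation. The head counts and the 5-to-1 emission ratio are the assumptions stated in the paragraph above; the function and constant names are made up for illustration.

```python
# Assumed head counts and relative emission rates from the paragraph above.
UNMASKED = 6          # group A: unmasked, unknowingly infected
MASKED = 14           # group B: masked, knowingly infected
RATE_UNMASKED = 5.0   # relative emission rate of one unmasked person
RATE_MASKED = 1.0     # relative emission rate of one masked person


def emission_shares(n_a, n_b, rate_a, rate_b):
    """Return (group A share, group B share, per-member A, per-member B)."""
    total = n_a * rate_a + n_b * rate_b
    share_a = n_a * rate_a / total
    share_b = n_b * rate_b / total
    # Within a group every member emits at the same rate, so the group
    # share is split evenly between its members.
    return share_a, share_b, share_a / n_a, share_b / n_b


share_a, share_b, per_a, per_b = emission_shares(
    UNMASKED, MASKED, RATE_UNMASKED, RATE_MASKED
)
print(f"{share_a:.5f} {share_b:.5f} {per_a:.5f} {per_b:.6f}")
# prints: 0.68182 0.31818 0.11364 0.022727
```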

Now if we remove group B entirely, the amount of pathogens present in the ambient air goes down accordingly: it would be only 68.182% of what it was with group B present. Let's make a few more unwise and unrealistic assumptions: that the initial overall probability for an attendee to become infected at this event is 6×0.05 + 14×0.01 = 0.44, and that the probability of becoming infected is a linear function of exposure. Removing group B entirely would then lower the overall probability of any attendee becoming infected to 0.44 × 0.68182 = 0.3.
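
The same linear assumption can be written as a short Python sketch. It only restates the arithmetic of the paragraph above; the function name and the cap at 1 are illustrative choices, and the 5% and 1% figures are the quoted hypotheses, not measured values.

```python
def linear_risk(n_unmasked, n_masked, p_unmasked=0.05, p_masked=0.01):
    """Risk under the linear assumption: per-emitter risks add up, capped at 1."""
    return min(1.0, n_unmasked * p_unmasked + n_masked * p_masked)


with_b = linear_risk(6, 14)    # groups A and B both present
without_b = linear_risk(6, 0)  # group B removed entirely
print(round(with_b, 2), round(without_b, 2), round(without_b / with_b, 5))
# prints: 0.44 0.3 0.68182
```

The ratio between the two risks is the same 0.68182 as group A's share of emissions, which is what "linear in exposure" means here.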

Your own calculation gave the probability for someone to not get infected after attending a number of successive events, each with one masked or unmasked SARS-CoV-2 emitter. This is not the same thing as attending a single event with all emitters present simultaneously, since we assume that n times the pathogen concentration in the ambient air implies n times the contamination risk (and at some point we would reach a saturation value, where the pathogen concentration is so high that ending up infected is a certainty).
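
To make the difference between the two readings concrete, here is a rough Python sketch. It is my interpretation of the two descriptions above, not a copy of either calculation: the "successive events" version multiplies the per-event probabilities of staying uninfected, while the "single event" version adds the risks and saturates at 1. The 6 and 14 head counts are the rounded figures used earlier in this message and may not match the quoted calculation exactly.

```python
import math


def successive_events_risk(probabilities):
    """Probability of at least one infection over independent exposures."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def single_event_risk(probabilities):
    """Risk proportional to total concentration, saturating at certainty."""
    return min(1.0, sum(probabilities))


# 6 unmasked emitters at 5% each, 14 masked emitters at 1% each (assumed).
emitters = [0.05] * 6 + [0.01] * 14
print(successive_events_risk(emitters), single_event_risk(emitters))

# With many more emitters the linear model saturates, the other never does.
crowd = [0.05] * 30
print(successive_events_risk(crowd), single_event_risk(crowd))
```

For small probabilities the two models give similar numbers, which is why the results often land close together; they diverge as the number of emitters or the per-emitter risk grows.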

By chance your own calculations often end up close to my own results, but some more fun awaits below. I would not dare to say that I mastered that one course on probabilities that I had to take some 25 years ago. I just did better than my peers at that time, and that's not worth much, trust me on this.

Let's now have a look at some of your numbers that are clearly off:

> - 20% of people with COVID are either asymptomatic or ignore their symptoms

I would bet that this one is way over 50%, maybe up to 80% with the current variants. That is, for every single COVID-infected person that has symptoms strong enough that they can't be ignored, you could have 4 other persons that have no symptoms, or symptoms so weak that they would not even suspect that they could have COVID.

For a start, every COVID-infected person starts out asymptomatic for a few days while already spreading the disease. Then, local health officials have confirmed that a good fraction of COVID-spreading persons never experience symptoms (or only symptoms so light that they won't notice them). And on top of that, there are people who will deliberately ignore their symptoms. As a concrete example of lacking symptoms, at least one of the COVID-positive persons at this DebConf only tested because they had spent some time close to me at some point; they never actually experienced any significant symptoms, but decided to self-isolate nonetheless.

> - The probability of a particular person being infected by an unmasked person with COVID is 5%
> - The probability of a particular person being infected by a COVID positive person wearing a mask is 1%

If this is read incorrectly (as you did) as “the probability of each single unmasked/masked person with COVID infecting any single attendee is 5%/1%” then (ignoring a few details, like that an infected person cannot be infected again, or that a newly infected person can now spread the disease after one or two days) with your other hypotheses the overall probability for an attendee to be infected at this event would be 6 × 5% + 14 × 1% = 44%.

And thankfully that nightmare 44% (or even 26% with your model) contagion scenario did not actually happen, far from it AFAIK. As I suspect that there were actually far more unsuspecting (and unmasked) COVID-infected people present than in your hypothesis, this would mean that the actual contagion probabilities were much (as in: several orders of magnitude) lower.

Finally let's have some fun with fuzzing: assuming any of the rough hypotheses above could be off by, say, half an order of magnitude (that is, 10^0.5 = 3.16), let's run both your and my model with lower and higher values and see which numbers we end up with.

I've published the corresponding Kotlin code [1]. Unfortunately the Kotlin playground [2] won't run it as is, because it uses full reflection, which is not allowed in the playground. On a Debian system, however, you can install the kotlin package and follow the instructions in the header to compile and run it (or just remove the fuzzing code to run the models in the web playground).

[1]: https://salsa.debian.org/-/snippets/793
[2]: https://play.kotlinlang.org/

> #0 - There are 400 Debconf attendees
> #1 - 5% of Debconf attendees currently have COVID
> #2 - 20% of people with COVID are either asymptomatic or ignore their symptoms
> #3 - COVID tests give a false negative 10% of the time
> #4 - 100% of people who know they have COVID wear a mask at all times
> #5 - 0% of unknowingly COVID-positive people wear masks
> #6 - The probability of a particular person being infected by an unmasked person with COVID is 5%
> #7 - The probability of a particular person being infected by a COVID positive person wearing a mask is 1%

I kept #0 constant, made the "upper" bound of #5 100% (would have remained 0 otherwise), and #4 has no "upper" bound as it's already at its max.
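
The general shape of such a fuzzing run can be sketched in a few lines of Python. This is not the published Kotlin code and is not expected to reproduce its figures: the way hypotheses #1 to #5 are chained into masked and unmasked head counts, the use of three values (low, nominal, high) per hypothesis, and all names are my own assumptions for illustration. Only the linear model is shown.

```python
import itertools

FACTOR = 10 ** 0.5  # half an order of magnitude, about 3.16
ATTENDEES = 400     # hypothesis #0, kept constant

# Nominal values of hypotheses #1 to #7 (key names are illustrative).
NOMINAL = {
    "infected": 0.05,         # #1 share of attendees with COVID
    "unaware": 0.20,          # #2 asymptomatic or ignoring symptoms
    "false_negative": 0.10,   # #3 test false-negative rate
    "mask_if_aware": 1.00,    # #4 knowingly positive people wearing a mask
    "mask_if_unaware": 0.00,  # #5 unknowingly positive people wearing a mask
    "p_unmasked": 0.05,       # #6 infection risk per unmasked emitter
    "p_masked": 0.01,         # #7 infection risk per masked emitter
}


def fuzzed_values(name, value):
    """Low, nominal and high candidates, clamped to 1 and deduplicated."""
    # #5 is 0 nominally, so scaling would leave it at 0: use 100% as "high".
    high = 1.0 if name == "mask_if_unaware" else min(1.0, value * FACTOR)
    return sorted({value / FACTOR, value, high})


def linear_model(p):
    """ASSUMED chaining of the hypotheses; linear risk capped at 1."""
    infected = ATTENDEES * p["infected"]
    # Unaware: no or ignored symptoms, or tested with a false negative.
    unaware_share = p["unaware"] + (1 - p["unaware"]) * p["false_negative"]
    unaware = infected * unaware_share
    aware = infected - unaware
    masked = aware * p["mask_if_aware"] + unaware * p["mask_if_unaware"]
    unmasked = infected - masked
    return min(1.0, unmasked * p["p_unmasked"] + masked * p["p_masked"])


def grid():
    """Yield every combination of fuzzed hypothesis values as a dict."""
    names = list(NOMINAL)
    choices = [fuzzed_values(name, NOMINAL[name]) for name in names]
    for combo in itertools.product(*choices):
        yield dict(zip(names, combo))


risks = [linear_model(p) for p in grid()]
print(len(risks), min(risks), max(risks))
```

Swapping `linear_model` for another function with the same signature is enough to fuzz a different model over the same grid.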

This yields, indeed, "interesting" results.

Depending on the combination and the model, the number of newly infected attendees (without self-isolation) could be as low as 9 (both models, p=0.0229), or as high as 387 with my model (p=0.98473) or 364 with yours (p=0.95789). And the difference that self-isolation makes to the probability of getting infected ranges all the way from 0 (in several combinations where contamination by unknowingly infected people is high, e.g. by raising #1 to #3) to 0.99211 (by lowering #1 to #3 and #7, and raising #5 and #6).

My conclusion is that this estimate of yours isn't worth anything, and neither are mine in this message. Actual contagion probabilities depend on too many factors, most of which are impossible to estimate even roughly, and AFAICT researchers in the field do not even try to figure them out. That's why they focus on other metrics such as R numbers, relative risk (or odds) factors, and cost assessments (i.e. the cost of not implementing a policy vs the cost of implementing it). We should keep that in mind while discussing this policy.

Cheers,

--
Julien Plissonneau Duquène

