
Re: COVID [was: Re: DebConf 25 Daily announcements - 2025.07.15 - Daytrip information && DebConf Day 2]



Hi Julien,

I'm putting this at the top even though it becomes more relevant later in the email: your probability calculations are wrong, and a simple example shows why. Imagine you flip a coin 3 times in a row and want to know the probability that at least one flip comes up heads. The probability of heads on any individual flip is 50%, so by your logic the probability of getting heads at least once in those 3 flips would be 1×0.5 + 1×0.5 + 1×0.5 = 1.5, or 150%. A probability can't be higher than 100%, and since three tails in a row is clearly possible, it can't even be 100%. The correct answer is 1 − 0.5×0.5×0.5 = 0.875, or 87.5%: one minus the probability of getting tails every time.
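
For anyone who wants to check the coin example, here is a minimal Python sketch. It counts every possible outcome of 3 fair flips and compares that with the complement rule, 1 minus the probability that no flip is heads. The function name p_at_least_one is just something I made up for illustration, and the formula assumes the flips are independent.

```python
from itertools import product


def p_at_least_one(probabilities):
    """Probability that at least one of several independent events occurs."""
    p_none = 1.0
    for p in probabilities:
        # Multiply together the chances that each event does NOT occur.
        p_none *= 1.0 - p
    return 1.0 - p_none


# Enumerate all 8 equally likely outcomes of 3 fair coin flips.
outcomes = list(product("HT", repeat=3))
with_heads = sum(1 for outcome in outcomes if "H" in outcome)

brute_force = with_heads / len(outcomes)      # counted directly
formula = p_at_least_one([0.5, 0.5, 0.5])     # complement rule
naive_sum = 0.5 + 0.5 + 0.5                   # adding the probabilities

print(f"brute force: {brute_force}")
print(f"formula:     {formula}")
print(f"naive sum:   {naive_sum}")
```

Counting and the complement rule agree with each other, while the plain sum lands above 1.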

On Saturday, July 26, 2025 8:12:59 a.m. Eastern Daylight Saving Time Julien Plissonneau Duquène wrote:
> CW: wall of text, probabilities, and some code
> TL/DR: GIGO
> 
> Hi Alex,
> 
> Le 2025-07-18 22:43, Alex Lieflander a écrit :
> > 
> > Here's an example calculation of the risk with some **rough** 
> > approximations. Let's assume that:
> 
> Let me first point out some flaws in your analysis, without even 
> discussing the numbers yet.
> 
> Risk of being infected is a function of many parameters, including 
> exposure to pathogens in the air, which is itself some integration of a 
> function of the distance and time spent at that distance from a pathogen 
> emitter (but also: strain of virus, temperature, sun, relative humidity, 
> air circulation, use of mask by none/one/both persons involved, type of 
> mask, is the mask worn properly, has the bearer of a FFP2 mask a beard, 
> vaccination status, ...) and your model doesn't account for distancing 
> (which means raising the distance and minimizing the time spent close to 
> others). It's also likely that a short but intense exposure will result 
> in a worse outcome than the same amount of exposure over an extended 
> period of time, but as I could not find any data about this I will 
> ignore this possibility as well.

I completely agree that virus propagation has a large number of factors, and that I didn't consider most of them in my model. Since it would be impossible for us to know every relevant factor, the only practical way to discuss it is with a simplified model. Anyway, your argument is based on the *relative* risk between very similar situations, so a lot of the approximation errors cancel out.

> Your reasoning about probabilities is also wrong. Let's assume, to keep 
> your 5-on-1 ratio, that members of unmasked COVID-infected folks (let's 
> call them group A, 6 of them) are each emitting pathogens in ambient air 
> at 5 times the rate of mask-bearing knowingly-COVID-positive folks 
> (let's call them group B, 14 of them). Group A is then accountable for 
> 6×5 / (6×5 + 14) = 0.68182 of pathogens in ambient air, while group B is 
> for the remainder = 0.31818. Then, for a single pathogen emitted by 
> either group, the probability that it was emitted by any single member 
> of that group is a flat distribution. That is, for a member of A, 
> 0.11364, and for a member of B, 0.022727.

If you assume that a person gets infected, then your probabilities for who infected them are probably correct. But that fails to take into account that a person is more likely to get infected in the first place when knowingly-positive people attend wearing a mask instead of self-isolating.

> Now if we remove entirely group B, the amount of pathogens that are 
> present in ambient air would go down accordingly: it would only be 
> 68.182% of what it was with group B present. Making a few more unwise 
> and unrealistic assumptions, such as the initial overall probability for 
> an attendee to become infected at this event being 6×0.05 + 14×0.01 = 
> 0.44, and the probability of becoming infected being a linear function 
> of the exposure, removing entirely group B would lower the overall 
> probability of any attendee becoming infected to 0.3.

That's not the correct way to combine probabilities. You can see for yourself in this free online textbook[1], or consider my example at the beginning. I'm going to skip the remaining parts of your email that use incorrect calculations.
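
As a rough sketch of the difference, here is the same complement rule applied to the numbers from your paragraph (6 people at 0.05 each, 14 people at 0.01 each). It assumes each person's chance of infecting a given attendee is independent of the others, which is my simplification and not something either of us has established. The function name p_at_least_one is made up for illustration.

```python
def p_at_least_one(probabilities):
    """Probability that at least one of several independent events occurs."""
    p_none = 1.0
    for p in probabilities:
        # Multiply together the chances that each event does NOT occur.
        p_none *= 1.0 - p
    return 1.0 - p_none


# Per-person probabilities taken from the quoted paragraph.
group_a = [0.05] * 6    # unmasked, unknowingly infected
group_b = [0.01] * 14   # masked, knowingly infected

naive_sum = sum(group_a) + sum(group_b)        # the quoted 0.44
p_both = p_at_least_one(group_a + group_b)     # groups A and B present
p_a_only = p_at_least_one(group_a)             # group B self-isolates

print(f"naive sum:            {naive_sum:.4f}")
print(f"A and B, independent: {p_both:.4f}")
print(f"A only, independent:  {p_a_only:.4f}")
print(f"relative increase from B attending: {p_both / p_a_only - 1:.1%}")
```

Under that independence assumption the combined figure stays below the simple sum, and it can never exceed 1 no matter how many people are added.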

> My conclusion is that this estimate of yours isn't worth anything, and 
> neither are mine in this message. Actual contagion probabilities are 
> depending on too many factors, most of them being impossible to even 
> estimate reasonably approximately, and AFAICT researchers in the field 
> do not even try to figure them out. That's why they focus on other 
> metrics such as R numbers, relative risk (or odds) factors, and cost 
> assessments (i.e. cost of not implementing a policy vs cost of 
> implementing it). We should keep that in mind while discussing this 
> policy.

I think this is your strongest point, and it brings us back to my first email: the exact probabilities don't really matter. I think we could agree on some combination of component probabilities that we both felt were reasonable and where the relative risk increase was noticeable. If it's reasonably possible that not self-isolating noticeably increases the risk of people dying, and the only cost of self-isolating is inconvenience, I don't think that's acceptable.

Regards,
Alex


