We like to do things in medicine, and medicine’s big contribution to science was figuring out how best to answer the question of whether the things we do actually work. But of course things aren’t so simple, because “Does it work?” is actually two questions: “Can it work?”, i.e. will an intervention do more good than harm under ideal circumstances, and “Does it work in practice?”, i.e. will an intervention do more good than harm in usual practice.
We also like to complicate things in medicine, so the first person to delineate this distinction, Archie Cochrane of the eponymous collaboration, named them efficacy and effectiveness respectively, just similar enough to cause confusion. He also added efficiency for good measure (“Is it worth it?”). Fifty years later, people are still grappling with these concepts and talking over each other’s heads when discussing value in health care. Which is to say, it’s best not to use the same prefix for overlapping terms, but if you had to, “eff” is most appropriate.
The most recent example is masks. The Cochrane Collaboration’s review said they didn’t “work” for preventing respiratory infections. (The paper caused an uproar and the language has since been toned down, but that was the gist.) Now, knowing what Cochrane was all about, the first question to ask is: what sense of “work” did the authors intend? This particular group is all about effectiveness (working in “the real world”), not about efficacy (working under ideal conditions). This caused some major cognitive dissonance among the covid-19 commenters. Vox had the typical sentiment:
Furthermore, neither of those studies [included in the meta-analysis] looked directly at whether people wear masks, but instead at whether people were encouraged or told to wear masks by researchers. If telling people to wear masks doesn’t lead to reduced infections, it may be because masks just don’t work, or it could be because people don’t wear masks when they’re told, or aren’t wearing them correctly.
There’s no clear way to distinguish between those possibilities without more original research — which is not what a meta-analysis of existing work can do.
But this is the difference between ideal conditions (you force a person to wear a mask and monitor their compliance) and typical conditions (you tell the person to wear a mask and keep your fingers crossed), and Cochrane is interested in the latter, which is the one more important to policy-makers. (Though of course, the chasm between ideal and typical circumstances varies by country, and some countries can do more than others to bring the circumstances closer to ideal, by more or less savory means.)
This is an important point: policy makers make broad choices at a population level, and thus (do? should?) care more about effectiveness. Clinicians, on the other hand, make individual recommendations for which they generally need to know both things: how would this work under ideal conditions, how does it work typically, and, if there is a large discrepancy, what should I do to make the conditions for this particular person closer to the ideal? We could discuss bringing circumstances closer to ideal at the population level as well, but you can ask the people of Australia how well that went.
The great colonoscopy debate is another good example of efficacy versus effectiveness. There is no doubt that a perfectly performed colonoscopy at regular intervals will bring the probability of having colon cancer very close to zero, i.e. the efficacy is as good as you can hope for in a medical intervention. But: perfection is contingent on anatomy, behavior, and technique; “regular intervals” can be anything from every 3 months to every 10 years; and there are risks of both the endoscopy and the sedation involved, or major discomfort without the sedation. And thus you get large randomized controlled trials with “negative” results that don’t end up changing practice. (Though they do provide plenty of fodder for podcasts and blogs, so, thanks?)
So with all that in mind, it was… amusing? to see some top-notch mathematicians (including Nassim Taleb!) trying to extrapolate efficacy data out of a data set created to analyze effectiveness. (The link is to the preprint; Yaneer Bar-Yam, the paper’s first author, has a good X thread as an overview.) To be clear, this is a worthwhile contribution and I’ll read the paper in depth to see whether its methods can be applied to cases where effectiveness data is easier to come by than efficacy data (i.e. most of actual clinical practice). But it is also an example of term confusion, where efficacy and effectiveness are for the most part used interchangeably, except in the legend for Table 1, which says, and I quote:
The two by two table provides the incidence rates of interest in a study of the efficacy (trial) or effectiveness (observational study) of an intervention to reduce risk of infection from an airborne pathogen.
Which seems to imply that you measure efficacy exclusively in trials and effectiveness exclusively in observational studies, but that is just not the case (the colonoscopy RCT being the perfect example of an effectiveness trial). And of course it is a spectrum: efficacy can only be perfectly measured under impossible-to-achieve conditions of 100% adherence and a sample completely representative of the population in question, so any clinical trial is “tainted” with effectiveness, though the further down you are on the Phase 1 to Phase 4 rollercoaster the closer you get to measuring pure effectiveness.
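To make the distinction concrete, here is a minimal sketch of how the same 2×2-style count data yields two different answers depending on which question you ask. Every number below is invented for illustration, from no real study; the “per-protocol” calculation is a deliberately naive stand-in for efficacy (real per-protocol analyses are biased, because compliers differ from non-compliers in other ways too):

```python
# Efficacy vs. effectiveness from the same (invented) trial counts.
# All numbers are made up for illustration only.

def risk_ratio(cases_a, n_a, cases_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (cases_a / n_a) / (cases_b / n_b)

# Hypothetical trial: 1000 people told to wear masks, 1000 controls.
# Only 600 of the 1000 actually complied.
told_n, control_n = 1000, 1000
complier_n = 600

# Invented infection counts:
infections_compliers = 30      # among the 600 who actually wore masks
infections_noncompliers = 60   # among the 400 who did not
infections_control = 150       # among the 1000 controls

# Effectiveness (intention-to-treat): everyone *told* to wear a mask
# counts as exposed, compliant or not -- "usual practice" conditions.
rr_itt = risk_ratio(infections_compliers + infections_noncompliers,
                    told_n, infections_control, control_n)

# "Efficacy" (naive per-protocol): only those who actually wore masks,
# approximating "ideal" conditions.
rr_pp = risk_ratio(infections_compliers, complier_n,
                   infections_control, control_n)

print(f"ITT (effectiveness) risk ratio:   {rr_itt:.2f}")  # 0.09/0.15 = 0.60
print(f"Per-protocol ('efficacy') ratio:  {rr_pp:.2f}")   # 0.05/0.15 = 0.33
```

Same data set, two risk ratios, and neither is wrong: they answer different questions, which is exactly why reading efficacy off a study designed to measure effectiveness takes more than relabeling the columns.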
I wonder how much less ill will there would be if the authors on either side realized they were talking about different things. The same amount, most likely, but one could hope…
Update: Not two seconds after I posted this, a JAMA Network Open article titled “Masks During Pandemics Caused by Respiratory Pathogens—Evidence and Implications for Action” popped into my timeline and wouldn’t you know it, it also uses efficacy and effectiveness interchangeably, as a matter of style. This is in a peer-reviewed publication, mind you. They shouldn’t have bothered.