Posts in: medicine

Wednesday links, science, medicine and pop psychology

  • Sasha Gusev: Thoughts on AI in academia. They are good ones. Extra points for leading me to an article from Sam Kriss in Harper’s Magazine about some magnificently agentic stupid people spending away their youth in San Francisco.
  • Ruxandra Teslo: Manufacturing requirements are killing cell and gene therapy. The FDA wants companies to make at least two batches of product at the highest standard of manufacturing before approving it for commercial use. You should know this before starting your clinical program, especially if you have manufacturing that’s expensive, so maybe make two small batches instead of a big one? Just a thought. Separately, none of this would be an issue if there was momentum towards considering cell & gene therapy more of a blood bank/cell processing thing than a commercial drug. But then you couldn’t charge as much, could you?
  • Regan Penaluna for Nautilus: Lessons in Chemistry, 19th-Century Style. Frustratingly, it takes Penaluna four paragraphs to mention the full name of Jane Marcet, the woman whose book “Conversations on Chemistry” inspired Michael Faraday — first paragraph mention! — to pursue science. The headline is also too broad: this was 1806, pre-Victorian times and barely 19th century. An extraordinary woman. Also: I want that book.
  • Kristen French, also for Nautilus: Solving Feynman’s Formula for Eating Well, Parking Your Car, and Finding a Mate. How Feynman’s scribbles in a Thai restaurant led to a paper in the Proceedings of the National Academy of Sciences, with mathematical proof of a common-sense inkling: more possible choices and more time should lead to more experimentation in order to discover “the best” of anything.
  • Adam Mastroianni: Stop eating Lady Gaga’s Oreos. One of Mastroianni’s best, hinged on one key insight: Americans used to see themselves as temporarily embarrassed millionaires. High fortune now being so out of reach for many that it is simply unimaginable, they now see themselves as temporarily anonymous celebrities instead, which is why we have become more tolerant of celebrities hawking Oreos and less tolerant of billionaires. Also, good confirmation that I did not imagine the period when artists were trying very hard not to be labeled as sellouts.

Perusing Sequoia Capital's healthcare company database, a word comes to mind

Repugnant.

Abby Care — the listing is alphabetical so it sits at the top — “empowers families to deliver exceptional care”. More specifically, it “trains family members to become paid caregivers for loved ones with disabilities or special needs” and “provides training and community support to deliver better care at lower cost”.

Mysteriously, it also brags about accepting major health insurance providers. Piecing together the fluffy prose on their website and their simple 3-step process [Note: Step 1: Get certified with Abby Care through no-cost training. Step 2: Begin care enrollment (with Abby care support for employment, payroll and administrative coordination… Wait, payroll?) Step 3: Deliver quality care. ] it seems that Abby Care wants to be the Uber of home health aid for people who have a family member in need. The proposition is that, since you are already doing all these things for your loved one, you may as well train to do it even better (great!) but also bill your loved one’s medical insurance for the “care” you “deliver” (wait, what?) and let Abby Care take a cut (ick!)

The website is also full of community, whether it’s “family-to-family connection and support”, “navigation of community programs and resources” or a statement that “Abby Care brings families together through shared experiences, practical support, and ongoing connection”. Yes yes, you can make this abomination into a community in the same way you can run dirty tap water through reverse osmosis and add back electrolytes to make it taste like a crisp mountain spring. But only one of those comes free, conditional on 1) it existing and 2) you getting to it; RO is significantly easier to obtain provided you have the electricity to run the machine, money for regularly replacing the remineralization filter, and tolerance for wasting four gallons of water for every one you drink.

Mr. Market has a knack for finding an inherently good thing people do out of altruism, sense of obligation or sheer humanity, then putting a price tag on it, taxing it and adding on a 15% service fee to boot — just look at what happened to Airbnb. The only reason some galaxy-level brain at Sequoia isn’t funding a child care scheme similar to Abby Care is that there is no equivalent in health insurance for the care of children. Otherwise I have no doubt that there would be a platform — it is always a platform — for parents to take care of each other’s progeny in return for some meager returns which the platform owners will garnish, laughing all the way to the Silicon Valley Bank.

The monetization of everyday interactions is not new — it first became salient to me after reading the 2019 book Capitalism, Alone by Branko Milanović, which the Wall Street Journal described as “an implausibly dystopian vision of global capitalism’s future.” I am sure the finance whizzes at the WSJ would not find this entire thing repugnant, but I truly wonder about the prevailing opinion among everyone else.


🎙️ One of these days I will have a nice thing to write about something, but until then here is an anti-recommendation for "Acquired", a podcast for people who are not me

Since the Acquired podcast was so heavily recommended by John Gruber and Ben Thompson on Dithering, I thought I would give it a try. Half an hour into an almost four-hour episode about Epic EMR, I am not impressed.

The style is in the uncanny valley between spontaneous and fully scripted, where the two co-hosts simulate a dialogue in the style of NotebookLM. I could actually tolerate that part: “The Rest Is History” podcast has the same shtick and I’ve listened to quite a few episodes. What I can’t stand is sloppiness about facts (ahem), and this is what one of the hosts uttered as an introduction to why medical records have become so important:

The important thing to realize from all this is that the vast majority of patients do not feel the cost of their healthcare directly in the United States. Those costs are so laundered through private insurance companies and Medicare and Medicaid, that most people think about any given health encounter as being paid for by someone else, by a part of some system.

If you’re trying to unpack how did our healthcare become 18% of GDP versus 11% of the UK GDP or a staggering 6% of Singapore’s GDP, albeit at a much smaller scale, why are we 18%? A big thing you have to understand is psychologically every healthcare encounter is that the system is paying for it. I’m paying into the system, the system is paying for it, but what does it cost? What do I actually pay? It’s a big abstraction.

This is, of course, hogwash. Citizens of the UK and Singapore are even further removed from knowing how much their health care costs — Singapore has a mandatory government-funded “base” coverage with voluntary private coverage on top, and the UK of course has the NHS which is funded straight from the budget — so anyone trying to blame the American healthcare disaster on patients, the implication being that they are spending money like drunken sailors because they don’t know what anything costs, is trying to pull the wool over your eyes.

I would also push back strongly against the first paragraph. No one — underlined, bold, in all caps NO ONE — in the United States of America thinks that their health encounter is being paid for by someone else, because the deductibles are high, so are the co-pays, the insurance premiums are staring at you from the pay stub, and at the end of each encounter with the “health” “care” “system” comes the Explanation of Benefits. I mean, who is this podcast even for?

Oh.

“Acquired tells the definitive history & strategy of the world’s greatest companies” says the home page, not realizing that there is a difference between history and hagiography. These guys are doing the latter, to much back-patting from the buy, borrow, die class which eats that stuff up. I’ll pass.


OpenEvidence is a technological Trojan horse at the gates of clinical practice

Go to openevidence.com and you will see, right under the elegant logo and a free text box prompting you to ask a medical question, an immodest tag line: “America’s Official Medical Knowledge Platform”. The boast sits above an enviable lineup of official partners: The New England Journal of Medicine, Journal of the American Medical Association, National Comprehensive Cancer Network, Cochrane Systematic Reviews. If you were a clinician in need of information these would be the first places to go, [Note: Save, perhaps, for a few journals in the JAMA network, and I write this as someone who has published in and reviewed for JAMA. ] but now there is no need because OpenEvidence will do it for you, for free and — unlike those poor community doctors whose practices can’t afford an NEJM subscription — with full access to all those journals.

Their About page is even more effusive. “Our mission is to help doctors save lives and improve patient care.” Great! It goes on:

This year, more than 100 million Americans will be treated by a clinician using OpenEvidence. As a product, OpenEvidence is an AI copilot for doctors that helps them make high-stakes decisions at the point of care. OpenEvidence is the most widely used medical AI among verified U.S. clinicians. To date, we have supported over 200 million AI-powered clinical consultations from U.S. doctors and other frontline clinicians.

In a remarkably short period of time, OpenEvidence has become the default operating system of medical knowledge in the United States.

Underneath lies the Team, laden with Harvard and MIT affiliations, and a long list of medical advisors ranging from Mayo, Hopkins and Mass General staff to prominent YouTubers.

It was a rather obvious idea, to create a specialized LLM chatbot which restricts its data sources to medical literature only, so when I first saw OpenEvidence, the way it presented itself (partnership with NEJM and JAMA, MIT affiliation) and the price (free for everyone with an NPI) I was pleasantly surprised that these institutions came together for the common good, to create our generation’s PubMed.

Hardy har har.

Scroll further down and under another immodest headline — “Supported by the Best” — sit the logos of Sequoia Capital, Kleiner Perkins, Blackstone, Andreessen Horowitz, Nvidia, Google Ventures and the like. Not listed on the website because there is no “Investor relations” page — that may spook the clinicians! — is the financial history. Earlier this year it raised $250 million in a Series D round at a $12 billion valuation. Just three months before that it raised $200 million at a $6 billion valuation. In total, it has received close to $700 million in funding over its four years of existence.

Yes, OpenEvidence, “the default operating system of medical knowledge in the United States” (their words, emphasis included), is a tech startup zipping through the first phase of enshittification, i.e. attracting users with a high-quality offering. I would argue that even the “high-quality offering” is a bit of a crock, but we’ll come back to that shortly. Let’s, for the purposes of this paragraph, go with the premise that the unique thing that OE provides is the “artificial intelligence” portion. Well, from what I understand the company relies on OpenAI, Anthropic and others for the actual compute and if that is the case they are one step removed from the absolute carnage whose genesis Ed Zitron and others have been diligently chronicling. The default operating system of American medicine is an earnings miss away from the blue screen of death.

I won’t cry for the billionaires involved. I will, however, mourn the opportunity cost of so many smart physicians and programmers on their medical and technical teams spending their time on point-one-percenter enrichment instead of truly building our generation’s PubMed. It would not even require compute! The true value of OE is the curated collection and unrestricted access to peer-reviewed journals, treatment guidelines, and systematic reviews, supplements and all. Let me google all that — or better yet, look it up on Kagi — and I will not care at all for the LLM-generated veneer glued onto man-made knowledge. But good luck having NEJM, JAMA et al. open their vaults without the VC-backed carrot of (I suspect) God knows how many millions of dollars for access rights combined with the FOMO stick that Anthropic and OpenAI’s PR teams have been so diligently whittling.

Trigger warning for an LLM-sounding phrase: the mounds of AI slop added to OE search results aren’t just wasteful, they are dangerous. Back in the Triassic era when shmucks like yours truly were nursing their middle-finger calluses writing progress notes by hand you knew that every part of that note contained useful knowledge. With the electronic medical record mandate — thanks, Obama — much of it became an unreadable mix of computer-generated charts and copypasta; you had to look at the end of the note to find actual human thought, whether it is in the Assessment and Plan or the Attending Addendum section. Well, I can report from the front lines that much of the time even that one meager paragraph has become a copy/paste job carrying with it that distinct LLM waft.

I am not against using LLMs for progress notes — we have been using human scribes for decades to write up the facts of the doctor-patient encounter. But those are costly and your rural primary care physician certainly won’t have one, so why not delegate that work to AI? The assessment and plan, however, are where you infuse those facts with meaning and then act on them, which is the entire purpose of the physician’s job. Writing is thinking and millions of US medical professionals have decided to delegate the one job they have to AI while keeping all the moral and legal responsibility, reverse-centauring themselves willingly and with eyes wide open.

This may seem like a “the food is horrible and the portions are too small” joke — have I not just written that the whole thing will soon be dead? If you are a physician who values their brain and doesn’t copy off a clanker, why should you care if others start relying on them and then get a rug-pull? Three reasons:

  • Expectation-setting: those who copy will need 15 minutes per encounter, then 10, then 5, continuing to ingest slop and regurgitate it over patient notes even as it gets increasingly bad from more and more expensive compute.
  • Asbestos exposure: as in, AI is the asbestos we are shoveling into the walls of our society, only the asbestos here is in the form of regurgitated slop we are putting into patient medical records. That, too, will take our descendants some time to dig out, although human life span being what it is it should be less than a whole generation.
  • Thinking of the kids: some of my own highest yield learning moments were reading the attending addendum on my note, or the dictation of a particularly skilled specialist’s consult note; will the incoming generations of medical students and residents have the same opportunity?

So if your mission truly was to help doctors save lives and you weren’t a greedy son of a bitch would you not have made a non-profit to achieve that goal? It may not have been as slick as something coming out of Silicon Valley, but it would also not have the risk of blowing up if the financial winds turn and the funding flywheel stops spinning. After all, there have been many attempts to replace the government-funded Medline/PubMed combo, but none of them were enough of an improvement (if any at all) to justify the cost.


Correcting a handful of misconceptions, inaccuracies and falsehoods in "The blood cancer that became solvable" by Ruxandra Teslo and Amol Punjabi

As a fan of Ruxandra Teslo’s writing — 25 mentions to date! — it pains me to write that her recent article in “Works in Progress”, for which she shares the byline with Open Evidence chief product officer Amol Punjabi, had me wince about a half-dozen times too many to ignore. Worse yet, I agree with the thrust of the article: that China is eating America’s lunch in cell and gene therapy and will soon come for the rest of biomedicine. Heck, that is one of the main reasons I am soon going back to clinical medicine, seeing too many business flights to Shanghai and Beijing time zone Zoom meetings in my future [Note: In case you were wondering, the correct number of each for me personally is exactly zero. ] had I continued down the industry path.

Alas, Teslo, Punjabi and whichever LLM did their research had cut too many corners on the way to the largely appropriate destination. Let’s count a few of them.

The old, cheap generic chemotherapy drugs still rock. A combination of two or three chemotherapy drugs developed in the 1970s and 80s is still the gold standard for treating testicular cancer. Chemotherapy has tamed what was once a pancreatic cancer-level death sentence into a diagnosis that doesn’t even have a “stage IV”. Speaking of pancreatic cancer: daraxonrasib, the K-Ras inhibitor which Teslo just a few weeks ago deemed a turning point, [Note: That article was, however, still of much higher quality than the one discussed here. Or maybe I don’t know as much about K-Ras and pancreatic cancer as I do about CAR T-cells and myeloma. Could this be a case of reverse Gell-Mann amnesia on my end? ] doesn’t even come close to what bleomycin, etoposide and cisplatin did for testicular. I guess they don’t make turning points like they used to.

The transformation of oncology started long before the mid-2010s. The article paints a simplistic picture of oncology’s history. First there was surgery, followed by, in the 1890s and the discovery of X-rays, radiation therapy. Blunt and unsophisticated chemotherapy which relies purely on the cancer cells’ propensity to divide faster than non-cancer cells came in the 1940s and 1950s. Finally, in the mid-2010s, after we learned more about the molecular biology of cancer [Note: I guess that, if you wanted to show off your academic status, you should use a 10-dollar word like “underpinnings” here instead of the plain, grade school level “biology”, much in the same way you should find and replace every “use” with “utilize”. But that would, of course, make you a 10-dollar ass. ] came “immunotherapies” by which the article largely means CAR T-cell therapies in general and one in particular, ciltacabtagene autoleucel, known to friends as cilta-cel (generic name) or Carvykti (brand name and the one used throughout the essay; this is telling).

Look, I am no fan of Siddhartha Mukherjee’s but at least his history of cancer, The Emperor of All Maladies, got the sequence right. Rituximab, a monoclonal antibody which some still consider the original immunotherapy — after all, it acts mainly by siccing patients’ own immune cells and complement towards the target lymphoma and leukemia cells — was approved in 1997 after a Phase 1 trial that started in 1994. Trastuzumab, another monoclonal, was approved for Her2-positive breast cancer in 1998. Imatinib, a revolutionary wonder-drug which inspired dozens of me-too small molecule competitors, had its first-in-human study in 1998 and was approved just three years later, in 2001. Each needed just 3 years to get from the very first patient being dosed to FDA approval; remember that factoid, it may become relevant in a few paragraphs. These were actual cures for lethal, aggressive cancers. But if the narrative is that China has accelerated the development of the first true advancement in cancer cures since the advent of chemotherapy let’s just pretend they don’t exist.

Myeloma treatment is not as brutal as painted. Although, of course, everything is in the eyes of the beholder, or rather the mind of the patient having to suffer through it. I do take issue with all three of the specific side effects that the essay highlights, as well as with how the time burden of myeloma is described. To wit:

Patients come in and out of the clinic for injections, take pills at home and undergo repeated blood tests, living according to a calendar organized around treatment days and recovery days. They also have to contend with the side effects of the medications. Dexamethasone can produce a sleepless agitation followed by a physical and emotional crash. Bortezomib often damages peripheral nerves, causing tingling and a burning pain in the hands. Daratumumab often leads to immune suppression, leaving patients more vulnerable to infections.

Dexamethasone is given in bursts and, thanks to the decidedly non-industry funded trials led by S. Vincent Rajkumar, at a much lower dose than before, minimizing these sorts of side effects. Similarly, bortezomib is now given less frequently and in different ways (under the skin instead of intravenously) to minimize nerve damage. And if you think immunosuppression is bad with daratumumab, well, try wiping out every antibody-producing cell in your body then waiting until you can get all of them back, and yes that includes needing to receive all your childhood vaccines again.

Separately, repeated blood tests are a sine qua non of multiple myeloma management, or really of any cancer management, even after a “cure”. If we aren’t monitoring for recurrence of the primary disease we are fussing over other cancers which may or may not be the result of the treatment itself, or of a person’s general propensity to have cancer. [Note: In fact, the two biggest risk factors for having cancer, other than a genetic mutation/hereditary syndrome, are age and prior personal history of cancer. ] And yes, that goes even for patients whose CAR-T treatment leads to durable complete remissions. Especially with CAR-T treatments which are known to cause cancer.

Speaking of which, cilta-cel/Carvykti is not a walk in the park either. Cytokine release syndrome (CRS) and Immune effector cell-associated neurotoxicity syndrome (ICANS) are two particularly nice side effects of all conventional CAR-T therapies, Carvykti included. They are frequent and severe enough that most patients need to be treated in the hospital and be within driving distance for the next four weeks. Many end up being admitted to the critical care unit. CAR-Ts that target BCMA, like Carvykti, also cause profound immunosuppression (vide supra) and require patients to repeat their childhood vaccination series. Carvykti, however, is in a league of its own as on top of all that it can also cause Parkinsonism. This is not to throw shade at CAR-Ts, they truly are revolutionary. But let’s not condemn other myeloma treatments for their toxicity when the alternative is worse in some ways, about the same in others.

BCMA CAR-Ts are, for most patients with multiple myeloma, not a cure. The essay cites 12-month results of the CARTITUDE-1 trial, where 76% of participants who received the cells [Note: But not including those enrolled to the trial who never got them, whether because they couldn’t be made, they were too sick to get them, or just plain died. This is how you play the denominator game. ] had no signs of myeloma at 12 months. Quote:

But what happened afterwards is perhaps even more striking: in the Abecma progression free survival curve, the line falls continuously. By contrast, in Carvykti, the line starts to plateau. Extended follow-up at five years confirmed that 33 percent of Carvykti patients remained disease-free.

This is false: there is no plateau. Figure 2 of the NEJM article describing these results has some numbers at the bottom not included in the Works in Progress essay. These represent the “number at risk” — participants who were still available for follow-up at a given time point; others have either progressed, resulting in an unwanted “drop” in the curve, or have not yet been followed for that long [Note: There are actually more reasons for a participant to be marked as a “tick” without dropping the line, i.e. to be “censored”, some more nefarious than others. For a good primer on this “informative censoring” see, for example, this article ] and are marked with a triangle here though more commonly they are merely a tick. The “plateau” is an artifact of too few participants getting to 24 months, only 9. It completely disappears in extended follow-up, with the curve continuing its descent at and past 24 months in Figure 2A, all the way to 60 months where a cluster of vertical tick marks precedes yet another mirage of a plateau, again with only a handful of patients being at risk. Let’s pray it ain’t so but I suspect that, if we were to continue following these participants to 10 years, the curve will continue going down and down and down.
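For the numerically inclined, here is a minimal sketch of why a thin tail manufactures a plateau. It is a bare-bones Kaplan-Meier estimate run on a made-up cohort of 20 participants; the function name and every number in it are my own illustration, not CARTITUDE-1 data. Censored participants shrink the number at risk without moving the curve, so the line sits flat for as long as the few people still being followed happen not to progress.

```python
from collections import Counter

def kaplan_meier(records):
    """records: list of (month, progressed) pairs; progressed=False means censored.
    Returns (month, number_at_risk, progression_free_estimate) at each event time."""
    events = Counter(t for t, progressed in records if progressed)
    exits = Counter(t for t, _ in records)  # everyone leaves the risk set at their own time
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # The curve only steps down at event times, by the fraction d / at_risk
            survival *= 1 - d / at_risk
            curve.append((t, at_risk, survival))
        at_risk -= exits[t]  # censored participants drop out without moving the curve
    return curve

# Hypothetical toy cohort, NOT trial data
toy = (
    [(m, True) for m in (2, 4, 6, 9, 12, 15)]                              # six progressions
    + [(m, False) for m in (16, 17, 18, 19, 20, 20, 21, 21, 22, 22, 23)]   # follow-up ran out
    + [(30, True), (33, False), (36, False)]                               # three followed past 24 months
)

for month, n, s in kaplan_meier(toy):
    print(f"month {month:>2}: at risk {n:>2}, progression-free {s:.2f}")
```

In this toy example the estimate sits at 0.70 from month 15 to month 30, a fifteen-month "plateau" that by the end rests on three people, and a single progression then knocks it down to 0.47. Which is why the numbers under the figure matter more than the shape of the line.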

You could tell the same story of artefact plateaus about daratumumab as well. It, too, has been pushed up all the way to first-line treatment and even before overt disease; concerns about longer follow-up needed for what is usually a slow-burning disease remain. Compare and contrast to imatinib in CML in this recent essay from Vinay Prasad, who concludes with:

There is progress in both diseases but more in CML. CML is more clearly a success story. There is much room for progress in myeloma. Myeloma is not yet curative, sadly. Presenting survival over time is misleading and masks more complicated narratives.

Carvykti’s approval timeline was not gobsmackingly fast. Most misleading is the side-by-side comparison of the development pathways of the Chinese cilta-cel and its American predecessor, ide-cel. Ide-cel includes the development of CAR T-cells in general (1989–2012) and the first BCMA targeting proof-of-concept (2013). Cilta-cel emerged from Zeus’s head in 2014, like it didn’t require both CAR-Ts to be developed and BCMA to be validated as a target. The tag line of the figure is that “China’s BCMA CAR-T reached FDA approval just 11 months after the US, despite starting decades later.” Hogwash.

Note the development timelines: ide-cel’s first-in-human study started in 2014. [Note: I should know: I was there! Funnily enough I was the in-house fellow on call on the days when two of the first 3 participants received their cells and had the honor of escorting them to the intensive care unit that very night. Both had their myeloma and all of their bone marrow wiped out in the process. ] It received FDA approval in 2021, for a total of 7 years of clinical trials. Cilta-cel’s first-in-human was in 2016 with a 2022 approval; 6 years. Let’s finish up our mini-mental test: how long did it take for the FDA to approve rituximab, trastuzumab and imatinib, from the first patient dosed?


These are only the highlights, but going much deeper would be nitpicking. I don’t know whether this amount of laxity with the truth was intentional, but the essay is almost as misleading as a Seattle lady’s GPS, taking her straight onto light rail tracks. Once there, you can only go in two directions: forward, towards loosening up regulations to match China’s Wild West, or backwards, tightening up regulatory requirements for Chinese assets and trials and punishing companies for doing business there. [Note: See how I tie going “forward” with less regulation and “backward” with more. This can easily be flipped to portray less regulation as going backwards, but I leave doing that in full as a fun exercise for you, dear Reader. ] What happened to going sideways? Diagonally? Up or down? What if the time to approve revolutionary cancer treatments has doubled because the follow-ups aren’t as revolutionary? And then get drowned out further by the me-toos and the ghost drugs which make much better competitors in the biotech beauty pageant, where whom you know and where you came from is more important than the increasingly pliant, malleable and quicksand-appearing ground truth?

But sure. China.


If it looks like a press release and reads like a press release, why is it being sold as a government report?

Doc in a Box from Alex Tabarrok links to an official state government document, from the Utah Department of Commerce. The document is titled “Key Statistics on the Doctronic Pilot Program” but reads more like a bulleted press release, full of percentages without a denominator, begging for a flow chart. Press releases are like that because you typically won’t add images — although this one randomly selected from today does indeed include one along with the full abstract submitted to the ASCO annual meeting, and good for them — but more importantly because you want to pick the best possible picture-perfect view of your shiny spotless data elephant without also acknowledging that it has a rear end, a bunch of flies buzzing around, and smells a bit rank. Does your elephant not have an ass, Utah? Or did you just copy/paste what Doctronic — a startup whose wonky web page doesn’t even work — sent you?

Screenshot of the Doctronic homepage with the message We hit a technical snag. Our engine encountered an issue; we are resolving it now. Doctronic. We have hit a technical snag. Go to Homepage to hit it again.

So how many patients could they have evaluated? This article in JAMA Forum says that “[p]hysicians hired by Doctronic will review the AI’s output for the first 250 patients before the system takes any action and will review the next 1000 patients retrospectively after the AI agent begins acting autonomously.” Are the key statistics from the first 250? The very first bullet point in the press release summary document says that the program is still in Phase One and that “the number of patients so far is limited”, so I guess not. Is it 100 at least? Surely they wouldn’t use a percentage as high as 97 if there were fewer than that involved. Except that a denominator as low as 29 will give you a percent roundable to 97 (28 of 29 is 96.6%). So, 29 to 249?
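The lower bound is easy enough to check by brute force. A minimal sketch, assuming the report rounds to the nearest whole percent with halves rounded up (the function names are mine and nothing here comes from the Utah document): under that assumption the smallest denominator that can produce a 97 comes out at 29, since 28 of 29 is about 96.6%, so the floor is roughly thirty patients either way.

```python
def rounds_to(k, n, target):
    """True if k out of n, as a whole-number percent (round half up), equals target.
    Integer arithmetic avoids floating-point surprises at the .5 boundary."""
    return (200 * k + n) // (2 * n) == target

def smallest_denominator(target, limit=1000):
    """Smallest n for which some count k out of n rounds to `target` percent.
    Returns (n, k), or None if nothing is found up to `limit`."""
    for n in range(1, limit + 1):
        for k in range(n + 1):
            if rounds_to(k, n, target):
                return n, k
    return None

print(smallest_denominator(97))  # (29, 28)
```

Reporting the raw counts would have made the whole exercise unnecessary.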

Why am I being so pedantic? Well, these techniques are par for the course in the biotech world, but coming from a state agency they make me think there is a bit too much enthusiasm for it. Compare and contrast to the shellacking LLMs got in this report from the Office of the Auditor General of Ontario, which reviewed AI Scribe functionality from 20 vendors. Their report even has absolute numbers in it! These state government officials should realize that they are prime targets for flim flam merchants and should behave accordingly.

Note that I am not against the idea in general. The project’s goal is in fact quite noble: there is no reason why plain ol’ machine learning shouldn’t be able to suss out the majority of refill requests for chronic medications and flag patients who haven’t had their bloodwork or diabetic foot assessments done, or who’ve had abnormal office blood pressure readings at prior visits. Having that easy refill option available would mean a patient coming in for an in-person visit for what should be “only” prescription refills is even more of a signal that something else may be amiss, even if the patient can’t or won’t verbalize it. So yes, LLM refills, bring ’em on. Doctronic’s end-goal of actual autonomous Shoggoths putting on white coats and replacing MDs, PAs, NPs and other credentialed humans… not so much.


Wednesday links, science and medical


Tuesday links bonanza

Your life’s goal should be to become the most improbable person you can be. Your path, your character, your life, should be the most unlikely, the most unexpected, the least predictable version you can make. Improbable lives have fewer competitors, more unique rewards, and are harder to replace with AIs, since AIs run on the predictable. This is true whether you favor traditional humanist directions or work on a frontier.

This is a nice preamble to a bit of personal news I can finally share: I will soon be going back [Note: It is a qualified “back”, as I have never actually practiced medicine full time, being either in training, doing clinical research as my main job, or being out of clinic altogether save for a few hours a week doing charity work. ] to the practice of clinical medicine. This week is in fact the last in my current position, which had been a magnificent experience but was going, as the careful reader of this blog would have already noted, in a direction not entirely suited to my preferred lifestyle and more importantly — let’s not sugarcoat it — values and beliefs. Onwards and upwards!

Whittaker, who is the president of the Signal Foundation (as in the app), had this to say about venture capital back in 2023:

Venture capital looks at valuations and growth, not necessarily at profit or revenue. So you don’t actually have to invest in technology that works, or that even makes a profit, you simply have to have a narrative that is compelling enough to float those valuations. So you see this repetitive and exhausting hype cycle as a feature in this industry. A couple of years ago, you would have been asking me about the metaverse, then last year, you would have asked me about Web3 and crypto, and for each of these inflection points there’s an Andreessen Horowitz manifesto.

It’s not simply that one piece of technology is overhyped, it’s that hype is a necessary ingredient of the current business ecosystem of the tech industry. We should examine how often the financial incentive for hype is rewarded without any real social returns, without any meaningful progress in technology, without these tools and services and worlds ever actually manifesting. That’s key to understanding the growing chasm between the narrative of techno-optimists and the reality of our tech-encumbered world.

Emphasis is mine, as it could be transposed word-for-word into the current world of drug development. Consider it a more polite rewording of Prof. Taleb’s take.

Commodified knowledge is “general knowledge” in the sense tested by trivia/quiz contests. In grade school, we actually had a subject on the curriculum called “GK” and kids good at it (I was one of them) got put on quiz teams to represent their class or school. General intelligence of the sort we actually have today is simply AIs trained on general (ie commodified) knowledge.

But the theological motte-and-bailey move that conflates it with some totalizing-universal divine-omniscience idea of “Artificial General Intelligence” traps a great many of even the smartest people. A category error motivated by theological yearnings, validated by second-order Labatutian psychoses, sustained by epistemic bubbles, and encouraged by sketchy business roadmaps that need a story to justify trillion-dollar investments.

This is a charitable way of justifying the AI billionaire panhandlers’ selling of large language models as AGI, even putting the term in official titles. Less charitably, they all know what Yann LeCun has been saying for years: LLMs will never reach a human level of intelligence (“ChatGPT, make me a sandwich”). Whether LeCun’s own pursuits are wise is a different matter.

Separately, Rao gives some good book tips and Benjamin Labatut’s When We Cease to Understand the World is now on the Pile.

No quotes because, true to form, everything salient is already in the title. Natural continuation of the debate started last week (see the last link), although apparently written before the new arXiv policy for a 1-year ban for hallucinated references.

Healy wrote a book about data visualization so I feel somewhat foolish in writing this, but I do not find Apple Sports’ presentation the least bit confusing: the numbers are absolute, the bars show percentage of the total. If the goal is to have more of each (assists, rebounds, steals, etc.) the bigger bar shows the opposing team’s dominance. It’s fine. Healy’s proposed solutions are all notably uglier and demote low-occurrence events like blocks and steals even though they may be crucial in a game. Shows how little both Healy and Gruber (on whose post Healy riffs) know about the game of basketball.

At Compleat Kidz, a fast-growing chain of autism clinics based in North Carolina, the policy is firm: Naps cannot be longer than seven minutes before children are awakened to resume therapy. The company says this is necessary to prevent fraud since clinics can be paid only when children are awake and getting services. But it also allows the clinic to bill insurers or Medicaid for more hours.

Yes, you have read that correctly. Waking up a child after a 7-minute nap to perform “therapy” — as if anything meaningful can be accomplished in that hypnagogic state — is both cruel and unusual. But not a punishment! It is merely a way to avoid fraud while optimizing revenue under the watchful eye of private equity:

Private equity firms have acquired at least 500 clinics over the past decade. “There’s just huge opportunities to grow these businesses and help increase access to care,” said Jon Krieger, a managing partner at Calex, a financial firm that assists with autism clinic mergers and acquisitions. He estimates the market could grow to $90 billion.

Mr. Market is a bad doctor, an even worse vet and, it seems, a most diabolical nanny.


The departure of Marty Makary is looking more and more like a Murder on the Orient Express situation: everyone wanted him out. Well, everyone except for uniQure, Capricor and ImmunityBio who were named in the original version of that Endpoints News story as some of the companies lobbying for Makary’s ouster, then asked for their mentions to be removed, as the Editor’s note now helpfully clarifies. C’mon, people. Own it.


First they came for the programmers… Then they came for the doctors. But not really.

Back in September 2023 I noted that the biggest hurdle for AI completely replacing physicians is the physicality of the job. Sure, LLMs are good at giving differential diagnoses and faking empathy once somebody’s problem has been reduced to text, but the art of medicine is in the act of seeing, feeling, smelling, etc. [Note: Although increasingly less so, as doctors and trainees are becoming experts at treating patients in the chart and not those in front of them, making themselves the perfect foils for replacement; cue photo of the old man yelling at clouds. ] If clankers have any hope of replacing humans, they’d better get some senses.

At first glance, a recent Nature Medicine paper aimed to do just that by introducing what the group of authors — all of them Google employees based in the UK and California — call “multimodal reasoning” but is in fact the chatbot being able to interpret images, ECGs and lab reports in addition to the pre-digested clinical pearl. The topline result, one that the journal itself felt obligated to headline, was that “AI had superior performance compared with physicians for almost every metric (29 of 32 axes)”. But at what?

You would think that the question would have been easy to answer, this being a peer-reviewed paper and all, but no. In fact, I am still not completely certain what interactions were performed and whether they completely match what was reported. What is certain is that a set of primary care physicians and patient-actors from Canada and India (countries different from the authors’ own, and let’s wonder conspiratorially why that may be the case) interacted via an instant messaging-like service. This is the first oddity: even remote health visits are performed using video calls, and yes you may occasionally get a text through the EMR or if you are a VIP/boutique physician maybe your phone, but that is far from the norm.

The primary report is on what happened when the patients uploaded the skin photos, ECGs, lab results, etc. and then asked the physician or LLM on the other end questions about them. Pretty standard fare for a human-to-LLM interaction, but not exactly natural for a doctor-patient relationship which usually starts with questions being asked of the patient. This is the second way in which the setup was made to fit the computer and not the human.

But then the last section of the paper is about what happens when there is, in fact, a back-and-forth by way of taking a history. The extended figures (“extended” here meaning not worthy enough of being included in the main paper) say it improves the performance of the LLM. They do not say how it affected the human performance, or how the patient-actors rated humans versus LLMs in history-taking. I would call that strike three.

To the journal’s credit, they did not allow Google to get away with it completely. “To evaluate the performance of our finalized system, we conducted a randomized, blinded human evaluation that emulates an objective structured clinical examination”, says the final paragraph of the introduction, only to end with:

We note, however, that our study is not a randomized clinical trial with prespecified endpoints and preregistered statistical analysis. Rather, it is an exploratory study investigating the properties of multimodal diagnostic dialogue.

Peer review is at least good for something, even if it does result in self-contradiction.

Meanwhile, in the world without motivated reasoning, more objective assessments of the usefulness of AI in medicine show that it is in fact still quite bad. This does not prevent the massively funded hordes of AI researchers from flooding the field with sloppy work, creating the impression that the rise of the machines is imminent. Comply or relegate yourself to the permanent underclass, serf MD. But of course, relegation will only be possible to the extent that medicine (or any other profession, really) has already debased itself and abandoned its core professional principles in the service of electronic ease.