The founders of statistics were a bunch of drunk gamblers so I’m used to seeing parallels between games of all kinds and science. In that vein, Andrew Gelman writing about cheating at board games made me think of cheating in clinical trials, only the right parallel there would be cooperative games in the style of Pandemic and the many Arkham/Eldritch Horror games made by Fantasy Flight.
These are tough games — Eldritch Horror in particular — where players and rule keepers are one and the same and all on the same team. And they are easy to beat if the players are willing to fudge a rule here or re-roll the dice there. And to be clear some of the rules are punishing including — from my absolute favorite, Arkham Horror: The Card Game — The Grim Rule:
If players are unable to find the answer to a rules or timing conflict in this Rules Reference, resolve the conflict in the manner that the players perceive as the worst possible at that moment with regards to winning the scenario, and continue with the game.
The Bonferroni correction for multiplicity could be a form of TGR, as could sensitivity analyses with particularly harsh assumptions. Note that TGR is a shortcut intended to enhance enjoyment of the game. Sure, your investigators may be devoured by Cthulhu or all go insane, but even that is more fun than 30 minutes spent looking up and cross-checking arcane footnotes in thick rule books, or worse yet trawling through Reddit for rule tips. Note also that clinical trials and science are not there (only) for fun and enjoyment and that applying TGR has consequences more serious than having to restart a game.
Which is to say, it pays off to be more thoughtful about your analyses and assumptions when designing a study and think — am I making this assumption and doing things this way because I don’t want to fool myself, or because the alternative would take too much time? The same goes for assumptions that are too lenient, but of course you already know that.
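To make the Bonferroni comparison above concrete, here is a minimal sketch of the correction in Python. The function name and the example p-values are made up for illustration; the rule itself is simply to compare each p-value against alpha divided by the number of tests, which is about as grim as a multiplicity adjustment gets.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one True/False per hypothesis: rejected after Bonferroni or not.

    Hypothetical helper for illustration: each p-value is compared
    against alpha / m, where m is the number of tests performed.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Made-up p-values from three endpoints of an imaginary trial.
p_values = [0.01, 0.04, 0.03]

# Without correction, all three clear the usual 0.05 bar.
unadjusted = [p <= 0.05 for p in p_values]

# With Bonferroni the bar drops to 0.05 / 3, and only the first survives.
adjusted = bonferroni_reject(p_values, alpha=0.05)
```

Less conservative procedures exist, which is the point of the parallel: Bonferroni is the quick, worst-case-for-you resolution, not necessarily the most thoughtful one.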
Visiting San Francisco and just had my first Waymo ride. It was the most obedient, defensive, proper driving I have ever seen, at once frustrating and uplifting. The world would be a better place if every car was fully self-driving and I can’t wait for them to come to DC.
I’ve spent enough time in the US to occasionally call sparkling water “club soda”, but I will never ever be American enough to call it friggin' seltzer.
I enjoyed Peter Thiel’s book and even more so Ryan Holiday’s book about Thiel. We also have a “mutual like” — René Girard — and the person whose book Wanting introduced me to Girard, Luke Burgis, is a stand-up guy who is a bit of a Thielophile. So, when Peter Thiel publishes an unpaywalled opinion piece in my newspaper of choice you can be sure I’ll read it. And so I did, with interesting results.
You see, Thiel seems to have a grievance against America. Here is an immigrant to the US of A who earned billions while in the country and co-founded a corporate behemoth whose major source of revenue is the US federal government, yet he does not seem to think the country is headed in the right direction. In a feat of projection, he presents FT’s readers with a laundry list of 20th and 21st-century conspiracy theories, from the murder of JFK to that of Jeffrey Epstein (sic!) to the covid-19 lab leak, then notes:
Perhaps an exceptional country could have continued to ignore such questions, but as Trump understood in 2016, America is not an exceptional country. It is no longer even a great one.
I will remind you that Holiday’s book about Thiel — “Conspiracy” — was, in fact, about Peter Thiel’s conspiracy to destroy Gawker as revenge for outing him as a gay man living in San Francisco.
Note that he doesn’t say exceptional countries wouldn’t have “these questions”, but rather that they could have ignored them. Indeed, the exceptional 1960s America ignored so many, mostly thanks to what Eric Weinstein — name-checked in Thiel’s essay — calls The Distributed Idea Suppression Complex (The DISC): a confluence of mass media and vetted experts who act as the great filter of what can be spoken about in polite company. So is America no longer exceptional because DISC broke down? Or because its citizens no longer have anything better to preoccupy themselves with?
Or perhaps I am overthinking the text’s possible Straussian reading. It could just as well be a middle finger to FT’s regular audience, the mass media-consuming elites, and a victory sign pointed at the unwashed internet masses able to climb the FT paywall on piles of Peter Thiel’s money (because I doubt the article became ungated out of the goodness of FT management’s heart). If so, kudos.
Today’s Slow Boring update started off great (the Home Alone house!), then came this whopper of a reasoning flaw and I stopped reading in frustration.
You can’t make any conclusions out of junk data, people, though apparently you can write a 5,000-word essay.
Today’s Stratechery update from Ben Thompson is about censorship and it is too bad that there is a paywall — email me if you’d like it forwarded — because it is the best overview of our current predicament. Ada Palmer’s Tools for Thinking about Censorship is still the best historical perspective.
❄️ DC public schools are back in session after two snow days and on one hand this is a relief — no one wants to make up extended snow days in the summer — but then most streets are still not plowed and have 0.5 lanes of traffic open making the morning drive a hazard. What is all this equipment for?
Two days ago I may have done some venting about peer review. Today I want to provide a solution: uber-peer review, by LLM.
The process is simple: as soon as the editor receives a manuscript and after the usual process determines it should be sent out for review, they upload it to ChatGPT (model GPT-4o, alas, since o1 doesn’t take uploads) and write the following prompt(s):
This is a manuscript submitted to the journal ABC. Our scope is XYZ and our impact factor is x. We publish y% of submissions. Please write a review of the manuscript as (choose one of the three options below):
- A neutral reviewer who is an expert in the topics covered by the article and will provide a fair and balanced review.
- A reviewer from a competing group who will focus and over-emphasize every fault of the work and minimize the positive aspects of the paper.
- A reviewer who is enthusiastic about the paper and will over-emphasize the work’s impact while neglecting to mention its shortcomings.
(the following applies to all three) The review should start with an overview of the paper, its potential impact to the field, and the overall quality (low, average or high-quality) of the idea, methodology, and the writing itself. It should follow with an itemized list of Major and Minor comments that the author(s) can respond to. All the comments should be grounded in the submitted work.
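The three prompt variants above can also be assembled programmatically. This is a minimal sketch, not a prescribed workflow: the function and variable names are hypothetical, the journal details are placeholders, and no LLM API is called. It only builds the text an editor would paste in alongside the uploaded manuscript.

```python
# Reviewer personas, taken from the three options listed above.
PERSONAS = {
    "neutral": (
        "A neutral reviewer who is an expert in the topics covered by the "
        "article and will provide a fair and balanced review."
    ),
    "hostile": (
        "A reviewer from a competing group who will focus and over-emphasize "
        "every fault of the work and minimize the positive aspects of the paper."
    ),
    "enthusiastic": (
        "A reviewer who is enthusiastic about the paper and will over-emphasize "
        "the work's impact while neglecting to mention its shortcomings."
    ),
}

# Instructions shared by all three variants.
SHARED_INSTRUCTIONS = (
    "The review should start with an overview of the paper, its potential "
    "impact to the field, and the overall quality (low, average or "
    "high-quality) of the idea, methodology, and the writing itself. It "
    "should follow with an itemized list of Major and Minor comments that "
    "the author(s) can respond to. All the comments should be grounded in "
    "the submitted work."
)


def build_review_prompt(journal, scope, impact_factor, acceptance_pct, persona):
    """Return the full prompt text for one reviewer persona (hypothetical helper)."""
    if persona not in PERSONAS:
        raise ValueError(f"unknown persona: {persona!r}")
    return (
        f"This is a manuscript submitted to the journal {journal}. "
        f"Our scope is {scope} and our impact factor is {impact_factor}. "
        f"We publish {acceptance_pct}% of submissions. "
        f"Please write a review of the manuscript as: {PERSONAS[persona]}\n\n"
        f"{SHARED_INSTRUCTIONS}"
    )


def build_all_prompts(journal, scope, impact_factor, acceptance_pct):
    """Build one prompt per persona so the three reviews can be compared."""
    return {
        name: build_review_prompt(journal, scope, impact_factor, acceptance_pct, name)
        for name in PERSONAS
    }


# Placeholder journal details, for illustration only.
prompts = build_all_prompts("ABC", "XYZ", 3.2, 15)
```

Keeping the personas in one dictionary means every manuscript gets the same three framings, which is what makes the biased reviews comparable across submissions.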
What comes out with prompt number 1 will be better than 80% of peer review performed by humans, and cases number 2 and 3 are informative on their own. If the fawning review isn’t all that fawning, well that’s helpful information regardless. A biased result can still be useful if you know the bias! Will any of it be better than the best possible human review? Absolutely not, but how many experts give their 100% for a fair review — if such a thing is even possible — and after how much poking and prodding from an editor, even for high impact factor journals?
And how many peer reviewers are already uploading the manuscripts they were asked to review to ChatGPT anyway, then submitting the output under their own name with more or less editing? What model are they using? What prompt? Wouldn’t editors want to be in control there?
Let’s formalize this now, because you can be sure as hell that it is already happening.
Much has been written and said about the faults of peer review, but there is one thing I think hasn’t been emphasized enough, so I’ll state it here: journal editors need to grow a spine. And they need to grow it in two ways, first by not sending obviously flawed studies out for peer review no matter where they come from, then by saying no to reviewers' unreasonable demands, not taking their comments at face value, and sometimes just not waiting 6+ months for a review to come back before making a decision.
📚 Finished reading: The Notebook by Roland Allen. It starts off strong, with an anecdote about the creation of the Moleskine brand, then goes into much depth about writing during the Renaissance and the Enlightenment, topping it off with a few modern developments like BuJo. The chapters are self-contained and packed with information without being bogged down in too much detail — the Moleskine chapter is a good example of what to expect — at the expense of an overarching “story”. So, this is a collection of vignettes more than a systematic review and categorization of the types of notebooks through history, and that’s fine.
A few highlights: