It has been more than six years since Nassim Taleb rightfully called IQ a pseudoscientific swindle. Yet this zombie idea keeps coming back, most recently as a meandering essay by one Crémieux who, through a series of scatter plots and other data visualizations, attempts to persuade readers that “National IQs” are a valid concept and that yes, they are much lower in South Asia and sub-Saharan Africa than in the rest of the world.
This hogwash prompted another series of exchanges on IQ ending, for now, with this X post that recapped some points from Taleb’s original essay for a lay audience. That alone is worth reposting, but what I thought was even more interesting was one of the replies:
But I still prefer my doctor or pilot or professor to have an iq of over 120 (at least). I am sure it matters. Not as the only characteristic, but still.
Missing the point so completely that it wasn’t worth replying to, the post is a good example of another IQ-neutral human trait: hypothesizing about properties in isolation without considering nth-order effects. Let’s say your surgeon’s IQ is 160. What are the implications for their specialty of choice, fees, where they work, and bedside manner? Are they more or less of a risk-taker because of it? Does their intellectual savvy transfer more to their own bottom line, picking high-reimbursement procedures over a more conservative approach? Even if you said “all else being equal, I’d prefer someone with a higher IQ”, well, why would you if everything else was equal? In that case, would it not make more sense to pick someone who did not have the benefit of acing multiple-choice questions based on pure reasoning rather than knowledge? And yes, Taleb wrote about that as well.
Another set of replies was on the theme of “well I don’t think we could even have a test that measures IQ”, showing that they don’t know what IQ is — it is the thing measured by an IQ test. There is some serious confusion in terms here and X is the worst place to have a discussion about it, everyone shouting over each other.
Finally, since I agree with Taleb that IQ as used now is a bullshit concept, people may surmise, as they did for him, that I took the test and am now, disappointed in the result, trying to discredit it. I do think it’s BS for personal reasons, but of a different kind: some 25 years ago, as a high school freshman in Serbia, I took the test and was accepted to Mensa. Having attended a single, tedious meeting in Belgrade shortly afterward, I saw that the whole thing was indeed laughable and haven’t thought about it again until reading that 2019 essay.
Having a high IQ means you are good at taking tests, and it correlates with success in life only as much as your life is geared toward test-taking. There is nothing else “there” there, and good test-takers unhappy with their lives should focus on life’s many other questions, like how to execute a proper deadlift and whether home-made fresh pasta is better than the dried variety.
The founders of statistics were a bunch of drunk gamblers so I’m used to seeing parallels between games of all kinds and science. In that vein, Andrew Gelman writing about cheating at board games made me think of cheating in clinical trials, only the right parallel there would be cooperative games in the style of Pandemic and the many Arkham/Eldritch Horror games made by Fantasy Flight.
These are tough games — Eldritch Horror in particular — where players and rule-keepers are one and the same, all on the same team. And they are easy to beat if the players are willing to fudge a rule here or re-roll the dice there. To be clear, some of the rules are punishing, including — from my absolute favorite, Arkham Horror: The Card Game — The Grim Rule:
If players are unable to find the answer to a rules or timing conflict in this Rules Reference, resolve the conflict in the manner that the players perceive as the worst possible at that moment with regards to winning the scenario, and continue with the game.
The Bonferroni correction for multiplicity could be a form of the Grim Rule (TGR), as could sensitivity analyses with particularly harsh assumptions. Note that TGR is a shortcut intended to enhance enjoyment of the game. Sure, your investigators may be devoured by Cthulhu or all go insane, but even that is more fun than 30 minutes spent looking up and cross-checking arcane footnotes in thick rule books, or worse yet trawling through Reddit for rule tips. Note also that clinical trials and science are not there (only) for fun and enjoyment, and that applying TGR has consequences more serious than having to restart a game.
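For readers who haven’t met it, the Bonferroni correction is about as blunt an adjustment as they come, which is why it fits the Grim Rule analogy. A minimal sketch in Python — the function name and the example p-values are mine, chosen for illustration:

```python
# Bonferroni correction: with m hypotheses tested at family-wise
# error level alpha, each raw p-value must clear alpha / m to be
# declared significant. Harsh and conservative -- very much in
# the spirit of the Grim Rule.

def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values capped at 1, per-test rejection flags)."""
    m = len(p_values)
    adjusted = [min(p * m, 1.0) for p in p_values]
    rejected = [p < alpha / m for p in p_values]
    return adjusted, rejected

# Three tests: only the smallest p-value clears 0.05 / 3.
adj, rej = bonferroni([0.004, 0.020, 0.030])
# rej -> [True, False, False]
```

Note that 0.020 and 0.030 would both pass an unadjusted 0.05 threshold; the correction is exactly the “worst possible resolution” the Grim Rule prescribes.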
Which is to say, it pays off to be more thoughtful about your analyses and assumptions when designing a study and think — am I making this assumption and doing things this way because I don’t want to fool myself, or because the alternative would take too much time? The same goes for assumptions that are too lenient, but of course you already know that.
Visiting San Francisco and just had my first Waymo ride. It was the most obedient, defensive, proper driving I have ever seen, at once frustrating and uplifting. The world would be a better place if every car were fully self-driving, and I can’t wait for them to come to DC.
I’ve spent enough time in the US to occasionally call sparkling water “club soda”, but I will never ever be American enough to call it friggin' seltzer.
I enjoyed Peter Thiel’s book and even more so Ryan Holiday’s book about Thiel. We also have a “mutual like” — René Girard — and the person whose book Wanting introduced me to Girard, Luke Burgis, is a stand-up guy who is a bit of a Thielophile. So, when Peter Thiel publishes an unpaywalled opinion piece in my newspaper of choice, you can be sure I’ll read it. And so I did, with interesting results.
You see, Thiel seems to have a grievance against America. Here is an immigrant to the US of A who earned billions while in the country and co-founded a corporate behemoth whose major source of revenue is the US federal government, yet who does not seem to think the country is headed in the right direction. In a feat of projection, he presents the FT’s readers with a laundry list of 20th and 21st-century conspiracy theories, from the murder of JFK to that of Jeffrey Epstein (sic!) to the covid-19 lab leak, then notes:
Perhaps an exceptional country could have continued to ignore such questions, but as Trump understood in 2016, America is not an exceptional country. It is no longer even a great one.
I will remind you that Holiday’s book about Thiel — “Conspiracy” — was, in fact, about Peter Thiel’s conspiracy to destroy Gawker as revenge for outing him as a gay man living in San Francisco.
Note that he doesn’t say exceptional countries wouldn’t have “these questions”, but rather that they could have ignored them. Indeed, the exceptional 1960s America ignored so many of them, mostly thanks to what Eric Weinstein — name-checked in Thiel’s essay — calls The Distributed Idea Suppression Complex (The DISC): a confluence of mass media and vetted experts who act as the great filter of what can be spoken about in polite company. So is America no longer exceptional because the DISC broke down? Or because its citizens no longer have anything better to preoccupy themselves with?
Or perhaps I am overthinking the text’s possible Straussian reading. It could just as well be a middle finger to FT’s regular audience, the mass media-consuming elites, and a victory sign pointed at the unwashed internet masses able to climb the FT paywall on piles of Peter Thiel’s money (because I doubt the article became ungated out of the goodness of FT management). If so, kudos.
Today’s Slow Boring update started off great (the Home Alone house!), then came this whopper of a reasoning flaw and I stopped reading in frustration.
You can’t draw any conclusions from junk data, people, though apparently you can write a 5,000-word essay.
Today’s Stratechery update from Ben Thompson is about censorship and it is too bad that there is a paywall — email me if you’d like it forwarded — because it is the best overview of our current predicament. Ada Palmer’s Tools for Thinking about Censorship is still the best historical perspective.
❄️ DC public schools are back in session after two snow days and on one hand this is a relief — no one wants to make up extended snow days in the summer — but then most streets are still not plowed and have 0.5 lanes of traffic open making the morning drive a hazard. What is all this equipment for?
Two days ago I may have done some venting about peer review. Today I want to provide a solution: uber-peer review, by LLM.
The process is simple: as soon as the editor receives a manuscript and after the usual process determines it should be sent out for review, they upload it to ChatGPT (model GPT-4o, alas, since o1 doesn’t take uploads) and write the following prompt(s):
This is a manuscript submitted to the journal ABC. Our Scope is XYZ and our impact factor is x. We publish y% of submissions. Please write a review of the manuscript as (choose one of the three options below):
- A neutral reviewer who is an expert in the topics covered by the article and will provide a fair and balanced review.
- A reviewer from a competing group who will focus and over-emphasize every fault of the work and minimize the positive aspects of the paper.
- A reviewer who is enthusiastic about the paper and will over-emphasize the work’s impact while neglecting to mention its shortcomings.
(the following applies to all three) The review should start with an overview of the paper, its potential impact on the field, and the overall quality (low, average, or high) of the idea, the methodology, and the writing itself. It should follow with an itemized list of Major and Minor comments that the author(s) can respond to. All comments should be grounded in the submitted work.
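For editors who want this to be reproducible across submissions, the three personas can live in a small template. Here is a sketch in Python — the function name, persona keys, and journal details are all mine, and the assembled string would be pasted into (or sent via API to) the model of choice:

```python
# The three reviewer personas from the prompt above, as a reusable
# template. All journal details below are placeholder examples.

PERSONAS = {
    "neutral": ("a neutral reviewer who is an expert in the topics "
                "covered by the article and will provide a fair and "
                "balanced review"),
    "hostile": ("a reviewer from a competing group who will focus on "
                "and over-emphasize every fault of the work and "
                "minimize the positive aspects of the paper"),
    "fawning": ("a reviewer who is enthusiastic about the paper and "
                "will over-emphasize the work's impact while "
                "neglecting to mention its shortcomings"),
}

STRUCTURE = (
    "The review should start with an overview of the paper, its "
    "potential impact on the field, and the overall quality (low, "
    "average, or high) of the idea, the methodology, and the writing "
    "itself. It should follow with an itemized list of Major and "
    "Minor comments that the author(s) can respond to. All comments "
    "should be grounded in the submitted work."
)

def build_review_prompt(persona, journal, scope, impact_factor, accept_pct):
    """Assemble the full editor prompt for one reviewer persona."""
    return (
        f"This is a manuscript submitted to the journal {journal}. "
        f"Our scope is {scope} and our impact factor is {impact_factor}. "
        f"We publish {accept_pct}% of submissions. "
        f"Please write a review of the manuscript as {PERSONAS[persona]}. "
        + STRUCTURE
    )

prompt = build_review_prompt("hostile", "ABC", "XYZ", 4.2, 15)
```

Running all three personas on the same manuscript and comparing the outputs is the whole point: the spread between the hostile and the fawning review is itself a signal.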
What comes out of prompt number 1 will be better than 80% of peer review performed by humans, and options 2 and 3 are informative in their own right. If the fawning review isn’t all that fawning, well, that’s helpful information regardless. A biased result can still be useful if you know the bias! Will any of it be better than the best possible human review? Absolutely not, but how many experts give their 100% for a fair review — if such a thing is even possible — and only after how much poking and prodding from an editor, even at high impact factor journals?
And how many peer reviewers are already uploading the manuscripts they review to ChatGPT anyway, then submitting the output under their own name with more or less editing? What model are they using? What prompt? Wouldn’t editors want to be in control there?
Let’s formalize this now, because you can be sure as hell that it is already happening.
Much has been written and said about the faults of peer review, but one thing I think hasn’t been emphasized enough, so I’ll state it here: journal editors need to grow a spine. And they need to grow it in two ways: first, by not sending obviously flawed studies out for peer review no matter where they come from; then, by saying no to reviewers’ unreasonable demands, not taking their comments at face value, and sometimes just not waiting 6+ months for a review to come back before making a decision.