
Casey Handmer on LLMs:

Every time one of the labs releases an updated model I give it a thorough shakedown on physics, in the style of the oral examination that is still used in Europe and a few other places. Claude, Grok, Gemini, and GPT are all advancing by leaps and bounds on a wide variety of evals, some of which include rather advanced or technical questions in both math and science, including Physics Olympiad-style problems, or grad school qualifying exams.

And yet, none of these models would be able to pass the physicist Turing test. It’s not even a matter of knowledge, I know of reasonably talented middle schoolers with no specialized physics training who could reason and infer on some of these basic questions in a much more fluent and intuitive way.

Alexander the Great had Aristotle; some poor kid will have a brain-dead version of Wheatley.

(Casey’s post is deeper than simple LLM-trashing, for he gives the actual 8-step process of reasoning through physics problems, so please do read the whole thing.)


In what feels like a troll but is in all likelihood completely serious, some parents are deciding to have their children fully immersed in AI LLMs:

We’re declaring bankruptcy on humans. Bring on the AI. In addition to integrating AI into as many facets of our lives as possible (our health, our work, our entertainment, and our personal lives), we’re designing an AI-integrated childhood for our kids—all while feeling like we’re helping them dodge a major bullet.

Did CS Lewis suspect, when he wrote The Abolition of Man, that the anti-human sentiment would be expressed as freely and overtly as in the first sentence of this intellectually bankrupt paragraph? The paragraph would be horrifying even if the AI it touts were actual intelligence, an AGI; what these families are actually immersing themselves in is industrial-grade bullshit. As useful as bullshit can be (I hear it makes for great fertilizer!), one should not drink it as one would Kool-Aid.


Nick Maggiulli on why the upper middle class isn’t special anymore:

Picture it. You’re at one of the nicest resorts in one of the most prized vacation destinations in the world and there are literal millionaires scrambling to get pool chairs at 8AM. What the hell is going on?

I’ll tell you. The upper middle class is getting too big. There are too many people who are millionaires and multi-millionaires and there simply isn’t enough space to accommodate them. Why do you think the Amex lounge is a zoo? Why do you think house prices haven’t come down? Why do you think vacations evolved into cut throat competitions?

Because there are too many people with lots of money.

I think he is onto something, for here is Jennifer Bradley Franklin of the NYT writing about $9,000 jigsaw puzzles:

Christine Murphy thinks she has a problem.

The 42-year-old grant writer and novelist has more than 150 puzzles in her collection at home in Portland, Maine, approximately 50 of which are hand-cut hardwood. She has one in progress at all times, and works on it every day.

“If I don’t get to do it, I get a bit glum,” she said. “I would happily do nothing but massive, thousand-piece hand-cut puzzles.” But, she added, referring to their price: “My God, those are multiple mortgage payments. It’s like a couture puzzle.”

A Stave Puzzles 800-piece limited edition costs $8,495 (on sale from $8,995). Orders from the company, founded in 1974, go up from there. A recent order from a single customer was close to $40,000, said Paula Tardie, an owner of Stave. “We have done wedding favors, puzzles for opening night gifts for Broadway shows and some very large puzzles for family reunions.”

“We have a couple of customers who, in the last decade, have spent over $500,000 with us,” said Mr. Danner of Elms.

If $9K can’t even get you a decent resort holiday, blowing it all on puzzles is as good as anything.


Two unrelated articles about AI greeted me from the feed reader this morning:

Both are worth reading, and Stephenson’s in particular may lead you down some nice rabbit holes owing to his profuse linking.


Cal Newport’s latest article about common sense in parenting closes with this punchline:

If you’re uncomfortable with the potential impact these devices may have on your kids, you don’t have to wait for the scientific community to reach a conclusion about depression rates in South Korea before you take action.

But does anyone — Georgetown math professors notwithstanding — make decisions this way, neatly compartmentalizing “the science” from their moral intuition? Or is there a mutually reinforcing interaction between the two, with our intuition exposing us to the confirmatory facts?


If this interview is anything to go by, Kevin Kelly is a wonderful human being and a true role model.

Not to put them on the same level (there is a whole generation between them), but the article reminded me of a similar conversation with Merlin Mann, now more than a decade old. Good Sunday reads both.


A few links for the weekend, kind-of-sort-of in the spirit of Good Work:


An excellent blog post that is not a rant: Single-function devices in the world of the everything machine, by Christopher Butler.

Limitations expand our experience by engaging our imagination. Unlimited options arrest our imagination by capturing us in the experience of choice. One, I firmly believe, is necessary for creativity, while the other is its opiate. Generally speaking, we don’t need more features. We need more focus.

Indeed.


Some of the best blog posts are rants, and Andrew Gelman just published one, about reckless disregard for the truth. Here is why he thinks the term “bullshit” does not apply:

In my post, I asked what do you call it when someone is lying but they’re doing it in such a socially-acceptable way that nobody ever calls them on it? Some commenters suggested the term “bullshit,” but that didn’t quite seem right to me, because these people seemed pretty deliberate in their factual misstatements.

I disagree. Whether the bullshitter is deliberate should not matter, and many do indeed BS with a specific goal in mind. In the examples he lists, those goals are inflating the impact of a paper and getting paid for expert testimony in favor of big tobacco. Indeed, dig deep enough and you will find hunger for money and prestige to be at the root of much bullshit.


A few good links to start the week: