Mac apps old and new
This week was a good one for learning about new apps:
- Zen browser puts an Arc skin over a Firefox core. It has already become my default browser. (ᔥBen Werdmuller)
- Macrowave enables AV streaming to anyone. I do plan on using it to stream but for now I am just marveling at the WinAmp esthetics. (ᔥGus Mueller)
- Inksightful digitizes paper notes. I am on the waitlist for the beta. (ᔥThe Iconfactory)
On the topic of apps, here are some fairly new ones (i.e. I have been using them for less than 5 years) that are passing the test of time, with honors:
- Tot, which is still my default for digital note-taking on both the Mac and iOS
- Flighty, which is invaluable for those flying in the US. Alas, it is not as useful, I have been finding out this summer, for travel in Europe. Those Europeans are not very good at updating their systems.
- The Cardhop/Fantastical combo, without which I could not imagine juggling four different contact lists and a half-dozen calendars.
- MarsEdit, which I missed listing initially even as I was using it to write this very post. It is so essential that I consider it part of the OS at this point.
And the apps that make me stick to the Mac even though I entertain from time to time the possibility of switching over to Linux:
- MailMate (since 2013)
- DEVONthink (since 2015)
- OmniFocus (since 2016)
- Tinderbox (since 2017)
- Audio Hijack and Sound Source (since 2020)
As much as I like the Mac hardware and the sleek aluminum esthetics, it is the software listed above that keeps me in the ecosystem. None of it is made by Apple.
The most recent issue of the FT Weekend Magazine is about games of all kinds, but the highlight is a massive article about the tragedy of Disco Elysium. It is depressing throughout, with a glimmer of hope buried near the end:
Kurvitz is making his next game at a new studio, Red Info, with Aleksander Rostov, Helen Hindpere and Chris Avellone, lead writer of the 1999 video game Planescape: Torment, a huge influence on Disco. “[Kurvitz] felt that Disco was the project in his head, and once he was cut off from the franchise, he was worried he didn’t have any other ideas in him,” Avellone told me. “I felt that was bullshit . . . Robert’s too creative to simply ‘not’ create something or rely on a single world idea in his head.”
Creators of Planescape: Torment and Disco Elysium working together on a new game? Be still, my heart.
Just another ("AI") Friday
- From the FT Editorial Board: The risk of letting AI do your thinking (the only nit I have to pick here is that by “AI” the esteemed Board means “LLMs” or, if you want to be kind and stretch the definition of intelligence, “generative AI”)
- From Dave Winer: AI should behave like a computer (see previous note)
- From a person on the Internet: Cognitive Hygiene: Why You Need to Make Thinking Hard Again (file under YouTube videos that should have been blog posts)
- From Michael Lopp: Every Single Human. Like. Always. (subtitle: The robots… They did the thing.)
And with these four links I hereby declare a moratorium on LLM-related matters on this blog, until further notice.
For a contrarian take on LLMs as intelligent machines, here is Alexey Guzey saying that:
- ChatGPT understands what it reads
- LLMs are creative
- Compression is intelligence and ChatGPT compresses really well
- The idea of AGI is stupid
- It doesn’t matter if AGI is real or not
I remain dubious.
An important note from Dave Winer:
I say ChatGPT instead of “AI” because I’m not comfortable characterizing it as intelligence. Deeper you get into it you learn that these beings whatever they are have serious character flaws that are counter-intelligent.
Exactly. LLMs are closer in intelligence to a screwdriver than to a human.
Flighty does not seem to be as up-to-date traveling internationally as it is on domestic flights. The IST airport departures board had our flight listed as delayed as soon as we got there, yet the app thought everything was fine. Trust no one.
Casey Handmer on LLMs:
Every time one of the labs releases an updated model I give it a thorough shakedown on physics, in the style of the oral examination that is still used in Europe and a few other places. Claude, Grok, Gemini, and GPT are all advancing by leaps and bounds on a wide variety of evals, some of which include rather advanced or technical questions in both math and science, including Physics Olympiad-style problems, or grad school qualifying exams.
And yet, none of these models would be able to pass the physicist Turing test. It’s not even a matter of knowledge, I know of reasonably talented middle schoolers with no specialized physics training who could reason and infer on some of these basic questions in a much more fluent and intuitive way.
Alexander the Great had Aristotle, some poor kid will have a brain-dead version of Wheatley.
(Casey’s post is deeper than simple LLM-trashing for he gives the actual 8-step process of reasoning through physics problems, so please do read the whole thing.)
In what feels like a troll but is in all likelihood completely serious, some parents have decided to fully immerse their children in LLMs:
We’re declaring bankruptcy on humans. Bring on the AI. In addition to integrating AI into as many facets of our lives as possible (our health, our work, our entertainment, and our personal lives), we’re designing an AI-integrated childhood for our kids—all while feeling like we’re helping them dodge a major bullet.
Did CS Lewis suspect, when he wrote The Abolition of Man, that the anti-human sentiment would be expressed as freely and overtly as the first sentence of this intellectually bankrupt paragraph? A paragraph that would be horrifying even if the AI it touts were actual intelligence, an AGI, but what these families are actually immersing themselves in is industrial-grade bullshit. As useful as bullshit can be — I hear it makes for great fertilizer! — one should not drink it as one would Kool-Aid.
📚 Finished reading: In the Beginning… Was the Command Line by Neal Stephenson, almost thirty years old and more relevant than ever. Download it for free here, and if you think you don’t have time for all 65 pages, Chapter 12 about the Hole Hawg should motivate you to read the entire essay.
A brief note on AI peer review, education and bullshit
When I wrote about formalizing AI “peer” review I meant it as a tongue-in-cheek comment on the shoddy human peer review we are getting anyway. “Wittgenstein’s ruler: Unless you have confidence in the ruler’s reliability, if you use a ruler to measure a table you may also be using the table to measure the ruler. The less you trust a ruler’s reliability (in probability called the prior), the more information you are getting about the ruler and the less about the table.”, Nassim Taleb in Fooled by Randomness. Peer reviewers are the ruler, the articles are the table, and there is zero trust in the ruler’s reliability. It was also (1) a bet that the median AI review would soon be better than the median human review (and remember, the median journal article is not submitted to Nature or Cell but to a journal that’s teetering on being predatory), and (2) a prediction that the median journal is already getting “peer” reviews mostly or totally “written” by LLMs.
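Taleb’s ruler metaphor can be made concrete with a toy Bayesian calculation (my own sketch, not from Taleb or the post; the numbers are made up for illustration):

```python
# Toy model of Wittgenstein's ruler. A reviewer (the "ruler") flags a
# paper (the "table") as flawed. Let r = P(the reviewer's verdict is
# correct). How much does a "flawed" verdict move our belief about the
# paper?

def posterior_flawed(prior_flawed, r):
    """P(paper flawed | reviewer says 'flawed') via Bayes' rule."""
    p_say_flawed = r * prior_flawed + (1 - r) * (1 - prior_flawed)
    return r * prior_flawed / p_say_flawed

prior = 0.5  # before the review, a coin flip
for r in (0.9, 0.6, 0.5):
    print(f"reviewer reliability {r:.1f} -> "
          f"posterior P(flawed) = {posterior_flawed(prior, r):.2f}")
# A reliable reviewer (r = 0.9) moves the posterior from 0.50 to 0.90;
# an unreliable one (r = 0.5) leaves it at the prior. With zero trust in
# the ruler, the verdict tells you about the reviewer, not the paper.
```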
Things have progressed since January on both of these fronts. In a textbook example of the left hand not knowing what the right hand is doing, some journals are (unintentionally?) steering their reviewers towards using AI while at the same time prohibiting AI from being used. And some unscrupulous authors are using hidden prompts to steer LLM review their way (↬Andrew Gelman). On the other hand, I have just spent around 4 hours reviewing a paper without using any AI help whatsoever, and it was fun. More generally, despite occasionally writing about how useful LLMs can be, my use of ChatGPT has significantly decreased since I fawned over deep research.
Maybe I should be using it more. Doc Searls just wrote about LLM-driven “Education 3.0”, with some help from a sycophantic ChatGPT which framed education 1.0 as “deeply human, slow, and intimate” (think ancient Greeks, the Socratic method and the medieval Universities), 2.0 as “mechanized, fast, and impersonal” (from the industrial revolution until now), and 3.0 as “fast and personal”. Should I then just let my kids use LLMs whenever, unsupervised, like Neal Stephenson’s Primer (“an interactive book that will adapt as the user grows and learns”)? But then would I want my kids hanging out with a professional bullshitter? Helen Beetham has a completely contrarian stance — that AI is the opposite of education — and her argument is more salient, at least if we take AI to mean only LLMs. Hope springs eternal that somebody somewhere is developing actual artificial intelligence which could one day lead to such wonderful things as the “Young Lady’s Illustrated Primer”.
Note the emphasis on speed in the framing of Education 3.0. I am less concerned about LLM bullshit outside of education, in a professional setting, since part of becoming a professional is learning how to identify bullshitters in your area of expertise. But bullshit is an obstacle to learning: this is why during medical school in Serbia I opted for reading textbooks in English rather than inept translations into Serbian made by professors with an aptitude for bullshitting around ambiguity. This is, I suppose, the key reason why we do not need LLMs there in the first place, for there is nothing stopping a motivated learner from browsing Wikipedia, reading any number of freely available masterworks online, watching university lectures on YouTube, and interacting with professionals and fellow learners via email, social networks, Reddit and whatnot. But you need to be motivated either way: to be able to wait and learn without immediate feedback in a world without LLMs, or to be able to wade through hallucinations and bullshit that LLMs can generate immediately. Education faces a bootstrapping problem here, for how can you recognize LLM hallucinations in a field you yourself are just learning?
The through-line for all this is motivation. If you review papers in order to check a career development box, to get O1 visa/EB1 green card status, and/or get brownie points from a journal I suspect you would see it as a waste of time and take any possible shortcut. But if you review papers because of a sense of duty, for fun, or to satisfy a sadistic streak — perhaps all three! — why would you want to deprive yourself of the work? Education is the same: if you are learning for the sake of learning, why would you want to speed it up? Do you also listen to podcasts and watch YouTube lectures at 2x? Of course, many people are not into scientia gratia scientiae and are doing it to get somewhere or become something, in which case Education 2.0 should be right up their alley, along with the playback speed throttle.