Deep Research continues to impress: here is a 4000-word essay on how the word “Pumpaj” — Serbian for “Pump!” — became the slogan of the 2024/25 protests. Even the prompt was LLM-engineered, as described in this Reddit post. So it goes…
It isn’t only grad students who should be worried about Deep Research:
Students cannot be expected to continue paying for information transfer that AGI provides freely. Instead, they will pay to learn from faculty whose expertise surpasses AI, offering mentorship, inspiration, and meaningful access to AGI-era careers and networks. Universities that cannot deliver this specific value will not survive. This isn’t a mere transformation but a brutal winnowing—most institutions will fail, and those that remain will be unrecognizable by today’s standards.
Yikes! This is from Hollis Robbins, much more in-depth and thought-out than my rapid review, though I take issue with her sticking the G in between the A and the I, because we are not there yet. (ᔥTyler Cowen)
Deep Research is the real deal, big changes ahead
One query in, I am convinced of the value of Deep Research and think it is well worth the $200 per month. The sources are real, the narrative less fluffy, the points it makes cogent. The narrative review is not dead yet, but it is on its way out. I am thinking here of those reviews made to pad junior researchers' CVs while they introduce themselves to a field, neutral in tone and seemingly comprehensive in scope. There will always be a place for an opinionated perspective from a leader in the field.
In a year, AI algorithms went from an overeager undergrad to a competent graduate student in every field of science, natural or social. Would o3 turn this into a post-doc, able and willing to go down any and every rabbit hole? Even now two hundred dollars per month is a bargain; if the price stays the same with next-generation models it will be a steal.
The one snag is that it is all centralized, and yes, the not-so-open OpenAI sees all your questions and knows what you want. For now. Local processing is a few years behind, but what is preventing Nvidia or Apple or whomever from putting all their efforts into catching up? How much would you pay for your own server that would produce its in-depth reports more slowly, say 30 minutes instead of 5, but be completely private? And all without benefits, travel and lodging for conferences, or any of the messy HR stuff.
The brave new world is galloping ahead.
(↬Tyler Cowen)
It has been almost a year and AVP continues to be almost a product
Has it been a year since Apple Vision Pro came out? It looks like it. And a year in, it is clear that it is great for two and only two very specific use cases:
- Watching media by yourself
- Being hyper-productive in confined quarters for long stretches of time.
Number 2 only became viable a few months ago when Apple turned on the ultra-ultra-ultra-wide display option, but it has since become my main use for the device. You need a long stretch of time because it is not convenient to take the headset on and off constantly. Since my work day is interrupted by meetings, these stretches are few and far between.
A third use case may pop up if Apple actually enables third-party controllers and developers actually port games to it, neither of which is a given. So the uses may expand, slowly, and the user base with them, but I did a quick search on AVP gaming just now and the top articles on Kagi — here is one — are from just before and just after the release. That's telling.
Visiting San Francisco, I just had my first Waymo ride. It was the most obedient, defensive, proper driving I have ever seen, at once frustrating and uplifting. The world would be a better place if every car were fully self-driving, and I can't wait for them to come to DC.
Journals should formalize AI "peer" review as soon as possible — they are getting them anyway
Two days ago I may have done some venting about peer review. Today I want to provide a solution: uber-peer review, by LLM.
The process is simple: as soon as the editor receives a manuscript and after the usual process determines it should be sent out for review, they upload it to ChatGPT (model GPT-4o, alas, since o1 doesn’t take uploads) and write the following prompt(s):
This is a manuscript submitted to the journal ABC. Our Scope is XYZ and our impact factor is x. We publish y% of submissions. Please write a review of the manuscript as (choose one of the three options below):
- A neutral reviewer who is an expert in the topics covered by the article and will provide a fair and balanced review.
- A reviewer from a competing group who will focus on and over-emphasize every fault of the work and minimize the positive aspects of the paper.
- A reviewer who is enthusiastic about the paper and will over-emphasize the work’s impact while neglecting to mention its shortcomings.
(the following applies to all three) The review should start with an overview of the paper, its potential impact to the field, and the overall quality (low, average or high-quality) of the idea, methodology, and the writing itself. It should follow with an itemized list of Major and Minor comments that the author(s) can respond to. All the comments should be grounded in the submitted work.
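If a journal wanted to formalize this, the three-option prompt above is easy to parameterize so every editor generates it the same way. Here is a minimal Python sketch; the function and dictionary names (`build_review_prompt`, `PERSONAS`) are my own hypothetical choices, not anything from an actual editorial system:

```python
# Hypothetical sketch: parameterizing the three-persona review prompt.
# The persona texts paraphrase the three options listed above.

PERSONAS = {
    "neutral": (
        "a neutral reviewer who is an expert in the topics covered by the "
        "article and will provide a fair and balanced review"
    ),
    "competitor": (
        "a reviewer from a competing group who will focus on and "
        "over-emphasize every fault of the work and minimize the positive "
        "aspects of the paper"
    ),
    "enthusiast": (
        "a reviewer who is enthusiastic about the paper and will "
        "over-emphasize the work's impact while neglecting to mention its "
        "shortcomings"
    ),
}

def build_review_prompt(journal: str, scope: str, impact_factor: float,
                        acceptance_rate: int, persona: str) -> str:
    """Assemble the editor's prompt for one of the three reviewer personas."""
    return (
        f"This is a manuscript submitted to the journal {journal}. "
        f"Our scope is {scope} and our impact factor is {impact_factor}. "
        f"We publish {acceptance_rate}% of submissions. "
        f"Please write a review of the manuscript as {PERSONAS[persona]}. "
        "The review should start with an overview of the paper, its potential "
        "impact on the field, and the overall quality (low, average, or high) "
        "of the idea, methodology, and the writing itself. It should follow "
        "with an itemized list of Major and Minor comments that the author(s) "
        "can respond to. All comments should be grounded in the submitted work."
    )

# Example: generate the "biased competitor" variant for journal ABC.
prompt = build_review_prompt("ABC", "XYZ", 4.2, 15, "competitor")
```

The editor would then paste (or send via API) the resulting prompt along with the uploaded manuscript; running all three personas side by side is what makes the biases legible.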
What comes out of prompt number 1 will be better than 80% of peer review performed by humans, and cases 2 and 3 are informative in their own right. If the fawning review isn't all that fawning, that is helpful information regardless. A biased result can still be useful if you know the bias! Will any of it be better than the best possible human review? Absolutely not, but how many experts give their 100% for a fair review (if such a thing is even possible), and only after how much poking and prodding from an editor, even at high-impact-factor journals?
And how many peer reviewers are already uploading manuscripts to ChatGPT anyway, then submitting its output under their own name with more or less editing? What model are they using? What prompt? Wouldn't editors want to be in control of that?
Let's formalize this now, because it is sure as hell already happening.
Today’s Stratechery update from Ben Thompson is about censorship and it is too bad that there is a paywall — email me if you’d like it forwarded — because it is the best overview of our current predicament. Ada Palmer’s Tools for Thinking about Censorship is still the best historical perspective.
Here are a few links to start off 2025 (see if you can spot a pattern):
- Things we learned about LLMs in 2024 (ᔥDaring Fireball)
- The new Turing test for AI video… is absolutely horrifying (ᔥMR)
- The Ghosts in the Machine
- On skilled immigration
Happy New Year, dear reader!
Additional notes from the future
I was peripherally aware that large language models had crossed a chasm in the last year or so, but I hadn't realized how large a jump it was until I compared ChatGPT's answers to my standard question: "How many lymphocytes are there in the human body?"
Back in February of last year it took some effort to produce an over-inflated estimate. Today, I was served a well-reasoned and beautifully formatted response after a single prompt. Sure, I have gotten better at writing prompts, but the difference there is marginal. Not so marginal is the leap in usefulness and trustworthiness of the model, which went from an overeager high school freshman to an all-star college senior.
And that is just the reasoning. Creating quick Word documents with tables and columns just the way I want them has become routine, especially when I want to recreate a document from a badly scanned printout. My office document formatting skills are getting rusty, and I couldn't be happier for it.
In his Kefahuchi Tract trilogy, M. John Harrison conjures up alien algorithms floating around the human environment, mostly helpful, sometimes not, motives unknown. Back in the early 2000s, when the first novel came out, I wondered what on earth he was talking about, but for better or worse we are now headed toward that world. Whether we are inching or hurtling, that depends entirely on your point of view.
(↬Tyler Cowen)
Mozi is a splendid idea for making serendipitous encounters happen. On the other hand, can you truly call these encounters serendipitous if they needed an app? (ᔥMatthew Haughey)