Two days ago I may have done some venting about peer review. Today I want to provide a solution: uber-peer review, by LLM.
The process is simple: once the editor has received a manuscript and determined through the usual process that it should be sent out for review, they upload it to ChatGPT (model GPT-4o, alas, since o1 doesn’t take uploads) and write the following prompt(s):
This is a manuscript submitted to the journal ABC. Our scope is XYZ and our impact factor is x. We publish y% of submissions. Please write a review of the manuscript as (choose one of the three options below):
- A neutral reviewer who is an expert in the topics covered by the article and will provide a fair and balanced review.
- A reviewer from a competing group who will focus and over-emphasize every fault of the work and minimize the positive aspects of the paper.
- A reviewer who is enthusiastic about the paper and will over-emphasize the work’s impact while neglecting to mention its shortcomings.
(the following applies to all three) The review should start with an overview of the paper, its potential impact to the field, and the overall quality (low, average or high-quality) of the idea, methodology, and the writing itself. It should follow with an itemized list of Major and Minor comments that the author(s) can respond to. All the comments should be grounded in the submitted work.
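If you would rather not retype this three times per manuscript, the prompt is easy to template. Here is a minimal sketch in Python, assuming nothing beyond the text above: the function names, the persona keys, and the example values are all made up for illustration, and it only builds the prompt strings. Uploading the manuscript and calling the model is left to whatever tool the editor already uses.

```python
# Hypothetical helper: every name below is illustrative, not part of any real tool.

# The three reviewer personas, worded as in the prompt above.
PERSONAS = {
    "neutral": (
        "A neutral reviewer who is an expert in the topics covered by the "
        "article and will provide a fair and balanced review."
    ),
    "adversarial": (
        "A reviewer from a competing group who will focus and over-emphasize "
        "every fault of the work and minimize the positive aspects of the paper."
    ),
    "enthusiastic": (
        "A reviewer who is enthusiastic about the paper and will over-emphasize "
        "the work's impact while neglecting to mention its shortcomings."
    ),
}

# The format instructions shared by all three personas.
REVIEW_FORMAT = (
    "The review should start with an overview of the paper, its potential "
    "impact to the field, and the overall quality (low, average or "
    "high-quality) of the idea, methodology, and the writing itself. It should "
    "follow with an itemized list of Major and Minor comments that the "
    "author(s) can respond to. All the comments should be grounded in the "
    "submitted work."
)


def build_review_prompt(journal, scope, impact_factor, acceptance_pct, persona):
    """Return the full prompt text for one reviewer persona."""
    if persona not in PERSONAS:
        raise ValueError(f"unknown persona: {persona!r}")
    return (
        f"This is a manuscript submitted to the journal {journal}. "
        f"Our scope is {scope} and our impact factor is {impact_factor}. "
        f"We publish {acceptance_pct}% of submissions. "
        f"Please write a review of the manuscript as: {PERSONAS[persona]}\n\n"
        f"{REVIEW_FORMAT}"
    )


def build_all_prompts(journal, scope, impact_factor, acceptance_pct):
    """Return a dict mapping each persona key to its prompt."""
    return {
        name: build_review_prompt(journal, scope, impact_factor, acceptance_pct, name)
        for name in PERSONAS
    }


if __name__ == "__main__":
    # Placeholder values; substitute the journal's real details.
    for name, prompt in build_all_prompts("ABC", "XYZ", 3.2, 15).items():
        print(f"--- {name} ---\n{prompt}\n")
```

Keeping the persona text and the format instructions in one place also means every manuscript gets the same three prompts, which matters if you want to compare the reviews across submissions.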
What comes out of prompt number 1 will be better than 80% of peer review performed by humans, and the outputs of prompts 2 and 3 are informative in their own right. If the fawning review isn’t all that fawning, well, that’s helpful information regardless. A biased result can still be useful if you know the bias! Will any of it be better than the best possible human review? Absolutely not, but how many experts give their 100% for a fair review (if such a thing is even possible), and after how much poking and prodding from an editor, even at high-impact-factor journals?
And how many peer reviewers are already uploading the manuscripts they were asked to review to ChatGPT anyway, then submitting the output under their own name with more or less editing? What model are they using? What prompt? Wouldn’t editors want to be in control there?
Let’s formalize this now, because you can be sure as hell that it is already happening.