AI supercharges scientific output while quality slips
AI writing tools are supercharging scientific productivity, with researchers posting up to 50% more papers after adopting them. The biggest beneficiaries are scientists who don’t speak English as a first language, potentially shifting global centers of research power. But there’s a downside: many AI-polished papers fail to deliver real scientific value. This growing gap between slick writing and meaningful results is complicating peer review, funding decisions, and research oversight.
After ChatGPT became widely available in late 2022, many researchers started telling colleagues they could get more done with these new artificial intelligence tools. At the same time, journal editors reported a surge of smoothly written submissions that did not seem to add much scientific value.
A new Cornell study suggests those informal reports point to a broader change in how scientists are preparing manuscripts. The researchers found that large language models (LLMs) such as ChatGPT can increase paper output, with especially strong benefits for scientists who are not native English speakers. But the growing volume of AI-written text is also making it harder for key decision makers to tell meaningful work apart from low-value content.
"It is a very widespread pattern, across different fields of science -- from physical and computer sciences to biological and social sciences," said Yian Yin, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science. "There's a big shift in our current ecosystem that warrants a very serious look, especially for those who make decisions about what science we should support and fund."
The findings appear in a paper titled "Scientific Production in the Era of Large Language Models," published Dec. 18 in Science.
How the Cornell Team Measured AI Use in Research Papers
To examine how LLMs are influencing scientific publishing, Yin's team compiled more than 2 million papers posted from January 2018 through June 2024 across three major preprint platforms: arXiv, bioRxiv and the Social Science Research Network (SSRN). Together, these sites represent the physical sciences, life sciences and social sciences, and they host studies that have not yet been through peer review.
The researchers used papers posted before 2023 that were presumed to be written by humans and compared them with AI-generated text. From that comparison, they built a model designed to flag papers that were likely written with help from LLMs. Using this detector, they estimated which authors were probably using LLMs for writing, tracked how many papers those scientists posted before and after adopting the tools, and then checked whether the papers were later accepted by scientific journals.
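The paper does not spell out the detector's architecture, but the general approach described here, training a classifier on presumed-human text versus AI-generated text and scoring new documents, can be illustrated with a minimal sketch. The toy phrases, function names and Naive Bayes setup below are illustrative assumptions, not the Cornell team's actual model:

```python
# Minimal sketch of an LLM-text detector: a bag-of-words Naive Bayes
# classifier trained on texts presumed human-written vs. texts known to be
# LLM-generated. This illustrates the idea only; it is NOT the study's model.
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

def train(human_texts, llm_texts):
    """Count word frequencies per class; these counts define the model."""
    counts = {"human": Counter(), "llm": Counter()}
    for t in human_texts:
        counts["human"].update(tokenize(t))
    for t in llm_texts:
        counts["llm"].update(tokenize(t))
    totals = {c: sum(counts[c].values()) for c in counts}
    vocab = set(counts["human"]) | set(counts["llm"])
    return counts, totals, vocab

def llm_log_odds(text, counts, totals, vocab):
    """Log-odds that `text` leans LLM-generated (positive => leans LLM).
    Laplace (+1) smoothing keeps unseen words from zeroing a probability."""
    v = len(vocab)
    score = 0.0
    for w in tokenize(text):
        p_llm = (counts["llm"][w] + 1) / (totals["llm"] + v)
        p_hum = (counts["human"][w] + 1) / (totals["human"] + v)
        score += math.log(p_llm / p_hum)
    return score

# Toy training data (invented for illustration).
human = ["we measured the samples carefully",
         "the experiment was repeated three times"]
llm = ["delve into the intricate tapestry of results",
       "this underscores the transformative potential"]
counts, totals, vocab = train(human, llm)
```

A new abstract would then be scored with `llm_log_odds`, and authors whose papers score consistently high after a given date would be flagged as likely LLM adopters. A production detector would, of course, use far richer features and far more data.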
Big Productivity Gains, Especially for Non-Native English Speakers
The results showed a clear productivity jump linked to apparent LLM use. On arXiv, scientists flagged as using LLMs posted roughly one third more papers than those who did not appear to use AI. On bioRxiv and SSRN, the increase exceeded 50%.
The boost was largest for scientists who write in English as a second language and face extra hurdles when communicating technical work in a foreign language. For example, researchers affiliated with Asian institutions posted between 43.0% and 89.3% more papers, depending on the preprint site, after the detector suggested they had begun using LLMs, compared with similar researchers who did not appear to adopt the technology. Yin expects the advantage could eventually shift global patterns of scientific productivity toward regions that have been held back by the language barrier.