OpenScholar Beats ChatGPT In Scientific Citation Accuracy

OpenScholar Trained On 45 Million Papers Outperforms ChatGPT And Other LLMs In Scientific Citation Accuracy And Research Usefulness

University of Washington researchers have released OpenScholar, an open source scientific LLM that beats proprietary tools like ChatGPT in citation accuracy and trusted literature synthesis.

OpenScholar, an open source large language model built specifically for scientific literature search and synthesis, has outperformed proprietary systems such as ChatGPT, GPT-4o and Perplexity in citation accuracy and answer usefulness. The research, published in Nature, positions the transparent tool as a credible alternative to black-box generative AI for science.

Developed by computer scientists Hannaneh Hajishirzi and Akari Asai at the University of Washington, the model was trained exclusively on 45 million open access scientific papers. It uses retrieval-augmented generation (RAG) to incorporate new information beyond its training data, reducing hallucinations, outdated responses and irrelevant citations.
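
To make the retrieval-augmented generation idea concrete, here is a minimal sketch of the general pattern: find the passages most similar to a question, then place them in the prompt with identifiers the model can cite. This is not OpenScholar's actual pipeline. The three-entry corpus, the bag-of-words similarity and the names `PAPERS`, `retrieve` and `build_prompt` are all illustrative assumptions, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a paper datastore (not real papers).
PAPERS = {
    "P1": "Retrieval augmented generation reduces hallucinated citations in language models.",
    "P2": "Protein folding prediction with deep neural networks.",
    "P3": "Citation accuracy of language models on scientific literature synthesis.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, papers, k=2):
    """Return ids of up to k papers most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = sorted(
        ((cosine(q, Counter(tokenize(text))), pid) for pid, text in papers.items()),
        reverse=True,
    )
    # Drop papers that share no words with the query.
    return [pid for score, pid in scored[:k] if score > 0]


def build_prompt(query, papers, k=2):
    """Assemble a prompt that asks the model to answer from cited sources only."""
    ids = retrieve(query, papers, k)
    context = "\n".join(f"[{pid}] {papers[pid]}" for pid in ids)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("citation accuracy of language models", PAPERS))
```

Because the prompt contains only retrieved passages with their ids, any citation in the answer can be checked against a specific source, which is the property retrieval is meant to provide. A production system would replace the word-overlap scoring with a trained retriever over a far larger index.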

Automatic testing showed higher citation accuracy than competing models. In manual evaluations, 16 domain experts compared AI responses with human-written answers. OpenScholar’s outputs were rated more useful than the human-written answers more than 50 per cent of the time, largely because they were more comprehensive and typically twice as detailed.

Demand surfaced quickly after an early demo release. “Quickly, we got a lot of queries, far more than we’d expected. It really speaks to the need for this sort of open-source, transparent system that can synthesize research,” said Hajishirzi. She added, “but the big question ultimately is whether we can trust that its answers are correct,” highlighting concerns with general-purpose AI.

Asai noted, “It might cite some research papers that weren’t the most relevant or cite just one paper or pull from a blog post randomly,” adding, “We’ve already seen a lot of scientists using OpenScholar because it’s open source. Others are building on this research and already improving on our results.”

The team is now developing Deep Research Tulu to deliver even more comprehensive scientific responses.
