The OpenEval Project

The OpenEval Project provides transparent, systematic evaluation of large language model (LLM) performance in scientific peer review. We compare LLM-generated reviews against traditional peer reviews to assess their accuracy, consistency, and reliability in identifying claims and evaluating scientific evidence.

Explore our dataset of manuscripts below. Click any paper to view detailed claim-by-claim comparisons between LLM and peer reviewer assessments. All papers are published under the CC-BY license and are available in their original form on the eLife website; the versions rendered here have been modified to highlight their claims.
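To give a concrete sense of what a claim-by-claim comparison involves, the sketch below shows one way such a record could be represented. The type names, fields, and verdict labels (ClaimComparison, llmAssessment, "supported", and so on) are illustrative assumptions for this sketch, not the project's actual data schema.

    // Illustrative sketch only: names and labels are assumptions,
    // not the OpenEval Project's actual schema.
    type Verdict = "supported" | "partially supported" | "unsupported" | "not addressed";

    interface ClaimComparison {
      paperId: string;          // e.g. an eLife article identifier
      claimText: string;        // claim extracted from the manuscript
      llmAssessment: Verdict;   // how the LLM review judged the claim
      peerAssessment: Verdict;  // how the human peer review judged the claim
      agreement: boolean;       // whether the two assessments match
    }

    // A comparison reduces to an agreement check between the two assessments.
    function agrees(llm: Verdict, peer: Verdict): boolean {
      return llm === peer;
    }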

Summary statistics: Papers Evaluated · Claims Extracted · OpenEval Reviews · Peer Reviews · Comparisons Made

Processed Manuscripts
