
There are two ways to show that an AI model is safe: show that it doesn't have dangerous capabilities, or show that it's safe even if it has dangerous capabilities. Currently, AI companies almost exclusively claim that their models don't have dangerous capabilities, on the basis of tests called model evals.[✲]

I think AI companies' evals are often poor. They often show that a model has somewhat concerning capabilities but give us little evidence about whether it has very dangerous capabilities. While I don't believe that AIs have catastrophically dangerous capabilities yet, I'm worried that companies' evals will still be bad in the future. If companies used the best existing evals and—crucially—followed best practices for running the evals and reporting results, the situation would be much better.[✲]

Additionally, there's currently no accountability for companies' evals. Companies should have external auditors check whether their evals are good and determine what they imply about dangerous capabilities.

I'm Zach Stein-Perlman. On this website, I collect and assess the public information on five AI companies' model evals for dangerous capabilities. There is a page for each company; to see one, click a logo below or use the topbar.

[Company logos: Anthropic, OpenAI, Google DeepMind, Meta, xAI]


This website is a low-key research preview. It's up to date as of May 26, but does not yet include analysis of Claude 4.