Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult
SMRTR summary
Anthropic has launched Claude Opus 4.5, which it calls the "best model in the world for coding, agents, and computer use," as competition with OpenAI and Google intensifies. The model is priced at $5/$25 per million input/output tokens, down from its predecessor's $15/$75, and adds new capabilities such as an effort parameter and enhanced computer use tools. Real-world testing, however, points to a growing challenge: frontier models are increasingly hard to tell apart, with improvements often marginal rather than transformative.
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.