SMRTR AI | Jun 22, 2026 | Hacker News

GLM-5.2 ranks better than GPT-5.5 in new agentic knowledge work eval

SMRTR summary

Claude Fable 5 tops AA-Briefcase, a new AI benchmark that tests models on real-world knowledge work such as financial modeling and strategy tasks, using thousands of messy files. GLM-5.2 ranks third overall and first among open-weight models, ahead of GPT-5.5, while costing less than 25% of Claude Opus 4.8's price.

SMRTR provides this summary for quick context. The original article belongs to Hacker News.

