Can modern LLMs actually count the number of b's in "blueberry"?
SMRTR summary
OpenAI's GPT-5 shows surprising weaknesses on simple letter-counting tasks. Asked how many b's are in "blueberry," GPT-5 often answers "three," although the word contains only two. Tests across multiple leading LLMs reveal inconsistent performance: some models, such as Claude, consistently get it right, while others struggle. The results challenge claims about AI's reasoning capabilities.
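The correct count is easy to verify deterministically. A minimal Python sketch, assuming a case-insensitive match is wanted (the helper name `count_letter` is hypothetical and does not come from the original article):

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    # Hypothetical helper for illustration only.
    return word.lower().count(letter.lower())


print(count_letter("blueberry", "b"))  # prints 2
```

The two b's sit at the first and fifth positions of the word, which is the answer a model should reproduce.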
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.