Vision AI models see optical illusions when none exist
SMRTR summary
A duck is just a duck, unless you're an AI. Then it might be a rabbit too.
Vision-capable AI models like ChatGPT are seeing optical illusions where none exist, according to research from Harvard psychologist Tomer Ullman. When shown a simple photo of a duck, ChatGPT confidently identified it as the famous duck-rabbit illusion and even attempted to highlight both animals in the image, producing a statistical chimera.
"I'm generally hesitant to map between mistakes these models make and mistakes people make," Ullman explained. If there is a human parallel, he suggested, the error is more akin to people applying a familiar solution to a problem they haven't fully processed.
This disconnect between visual perception and language understanding raises significant concerns as these technologies are deployed in robotics and other critical applications.
While some might call this phenomenon "hallucination," Ullman prefers more precise terminology, noting that "the term 'hallucination' has kind of lost meaning in current research."
The study evaluated eight different AI systems, finding that even the most sophisticated commercial models – GPT-4, Claude 3, and Gemini 1.5 – consistently see patterns that simply aren't there.
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.