Only 1 LLM can fly a drone
SMRTR summary
A new drone simulation called SnapBench tested seven leading AI language models on their ability to pilot a virtual drone through a 3D world to locate and identify creatures. Only Google's Gemini Flash succeeded at the task. More expensive and supposedly more advanced models, such as Claude Opus and GPT-4, failed primarily because they could not master the altitude control needed to descend and properly identify ground-level targets.
SMRTR provides this summary for quick context. The original article belongs to Hacker News.