Can LLMs earn $1M from real freelance coding work?
SMRTR summary
A recent OpenAI study assessed how well AI models perform on real-world software engineering work, testing them on more than 1,400 freelance jobs with combined payouts of over $1 million. The top-performing model, Claude 3.5 Sonnet, completed 33.7% of the tasks, earning about $403,000 of that total. The models fared better at management tasks than at implementation tasks. Allowing multiple attempts and increasing "reasoning effort" both improved results. Although the results show progress, AI models still struggle with the majority of complex coding projects.
SMRTR provides this summary for quick context. The original article belongs to Hacker News.