SMRTR AI | Aug 28, 2025 | Daily.dev

Are OpenAI and Anthropic Really Losing Money on Inference?

SMRTR summary

An analysis of AI inference costs shows a stark gap between input and output token economics. On a cluster of 72 H100 GPUs rented at $2 per GPU-hour, processing input costs about $0.003 per million tokens, while generating output costs about $3.08 per million, roughly a thousand-fold difference. This gap explains why services like ChatGPT and Claude Code can be profitable despite heavy usage. By this estimate, API businesses enjoy 80-95% gross margins, which challenges the notion that AI inference is unsustainably expensive. The economics favor applications that process large inputs and produce small outputs, while video generation remains costly because it has the reverse pattern: small inputs and very large outputs.
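To make the arithmetic concrete, here is a minimal Python sketch that uses only the figures quoted in the summary (72 GPUs, $2 per GPU-hour, $0.003 and $3.08 per million tokens). The function names and the example request sizes are hypothetical, and the throughput values are back-calculated from the quoted costs rather than taken from the original article.

```python
# Figures quoted in the summary above.
GPUS = 72
USD_PER_GPU_HOUR = 2.00
INPUT_USD_PER_M = 0.003   # cost per million input tokens
OUTPUT_USD_PER_M = 3.08   # cost per million output tokens


def cluster_usd_per_hour(gpus=GPUS, rate=USD_PER_GPU_HOUR):
    """Hourly rental cost of the whole cluster."""
    return gpus * rate


def implied_tokens_per_second(usd_per_million, usd_per_hour):
    """Cluster throughput that a given per-million-token cost implies.

    This is a back-calculation from the quoted costs, not a measurement.
    """
    millions_per_hour = usd_per_hour / usd_per_million
    return millions_per_hour * 1_000_000 / 3600


def request_cost_usd(input_tokens, output_tokens):
    """Compute cost of one request at the quoted per-token rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    hourly = cluster_usd_per_hour()
    print(f"cluster cost: ${hourly:.2f}/hour")
    print(f"output/input cost ratio: {OUTPUT_USD_PER_M / INPUT_USD_PER_M:.0f}x")
    print(f"implied output rate: "
          f"{implied_tokens_per_second(OUTPUT_USD_PER_M, hourly):,.0f} tokens/s")
    # Hypothetical request: long prompt, short answer.
    print(f"100k in / 1k out: ${request_cost_usd(100_000, 1_000):.5f}")
```

In the hypothetical request above, the 1,000 output tokens account for roughly ten times the cost of the 100,000 input tokens, which is the asymmetry the summary describes.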

SMRTR provides this summary for quick context. The original article belongs to Daily.dev.
