SMRTR AI | Jun 7, 2026 | Hacker News

Deep Dive into LLM Token Cost: How Prompt Caching Works

SMRTR summary

Prompt caching in Claude bills a file in three distinct phases: it enters as fresh input at the full input price, is written to the cache one turn later at a 25% premium, and then costs just 10% of the full price on each subsequent turn. In long sessions this cuts costs dramatically, but a gap longer than five minutes lets the cache expire, making resumed sessions up to 12 times more expensive than active ones.
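
The arithmetic behind those three phases can be sketched in a few lines of Python. This is a minimal illustration and not code from the original article: the multipliers come straight from the summary, while the helper names (`file_cost`, `uncached_cost`) and the 10-turn session are hypothetical. It also assumes, as one plausible reading, that the "up to 12 times" figure reflects the ratio between a cache write (paid again after the cache expires) and a cache read (paid while the session stays active).

```python
# Price multipliers relative to the base input-token price, as given in the summary.
FRESH = 1.00        # turn 1: the file enters as ordinary, uncached input
CACHE_WRITE = 1.25  # turn 2: the file is written to cache at a 25% premium
CACHE_READ = 0.10   # turn 3 onward: the file is read from cache at 10%


def file_cost(turns: int) -> float:
    """Relative cost of carrying one file through `turns` consecutive turns,
    assuming no gap long enough to expire the cache (hypothetical helper)."""
    if turns <= 0:
        return 0.0
    cost = FRESH
    if turns >= 2:
        cost += CACHE_WRITE
    # Every turn after the second is a cache read.
    cost += CACHE_READ * max(0, turns - 2)
    return cost


def uncached_cost(turns: int) -> float:
    """Relative cost if the file were billed as fresh input on every turn."""
    return FRESH * max(0, turns)


# Assumption: a resumed session pays a cache write where an active one
# would have paid a cache read, so the per-turn penalty is their ratio.
resume_ratio = CACHE_WRITE / CACHE_READ

if __name__ == "__main__":
    turns = 10  # illustrative session length
    print(f"cached:   {file_cost(turns):.2f}x base price")
    print(f"uncached: {uncached_cost(turns):.2f}x base price")
    print(f"write/read ratio after expiry: {resume_ratio:.1f}")
```

Under these assumptions, a file carried through ten uninterrupted turns costs roughly 3 times its base input price instead of 10 times, and the write-to-read ratio of about 12.5 is consistent with the "up to 12 times" penalty quoted for resumed sessions.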

SMRTR provides this summary for quick context. The original article belongs to Hacker News.

