Deep Dive into LLM Token Cost: How Prompt Caching Works
SMRTR summary
Prompt caching in Claude works in three distinct phases: a file first enters the context as fresh input at the full input price, is written to the cache one turn later at a 25% premium, and then costs just 10% of the base input price on each turn after that. In long sessions this cuts costs dramatically, but the cache expires after a gap of more than five minutes, which can make a resumed session up to 12 times more expensive than an active one.
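The three phases can be turned into simple arithmetic. The sketch below is only an illustration built from the percentages quoted in the summary; the function names, the `base_price_per_token` parameter, and the idea of pricing in "token-equivalents" are assumptions made here, not anything taken from the original article or from an official API.

```python
# Price multipliers relative to the base input-token price.
# The three values come from the summary; everything else is illustrative.
FRESH_INPUT = 1.00  # turn 1: the file is plain, uncached input
CACHE_WRITE = 1.25  # turn 2: the file is written to cache at a 25% premium
CACHE_READ = 0.10   # turn 3 onward: the file is read from cache at 10%


def cached_file_cost(tokens: int, turns: int, base_price_per_token: float = 1.0) -> float:
    """Cost of carrying one file through `turns` back-to-back turns.

    Assumes no gap is long enough to expire the cache.
    """
    if turns <= 0:
        return 0.0
    total_multiplier = FRESH_INPUT
    if turns >= 2:
        total_multiplier += CACHE_WRITE
    if turns >= 3:
        total_multiplier += CACHE_READ * (turns - 2)
    return tokens * base_price_per_token * total_multiplier


def uncached_file_cost(tokens: int, turns: int, base_price_per_token: float = 1.0) -> float:
    """Same file, but paid for at the full input price on every turn."""
    return tokens * base_price_per_token * max(turns, 0)


if __name__ == "__main__":
    tokens, turns = 1000, 12
    print("cached:  ", cached_file_cost(tokens, turns))
    print("uncached:", uncached_file_cost(tokens, turns))
    # Ratio between a cache-write turn and a cache-read turn.
    print("write/read ratio:", CACHE_WRITE / CACHE_READ)
```

Under these multipliers a cache-write turn costs 12.5 times as much as a cache-read turn, which may be where a figure of roughly 12x for resumed sessions comes from; that reading is an assumption, since the summary does not spell out the calculation.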
SMRTR provides this summary for quick context. The original article belongs to Hacker News.