DeepSeek V4's indexer dies at 65K. We got it to 1M on 6GB
SMRTR summary
StreamIndex, a Triton-based tool, works around the 65K-token limit of DeepSeek V4's indexer by processing the input in chunks instead of all at once. This extends capacity 32x to 1 million tokens while cutting GPU memory use from 256GB to 6GB.
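To make the chunking idea concrete, here is a minimal NumPy sketch. It is not StreamIndex's code and does not use Triton. It assumes, purely for illustration, that the indexer's job is to pick the top-k highest-scoring keys for a query. The function names `topk_full` and `topk_streaming` and the `chunk_size` parameter are hypothetical. The point it illustrates is that a running top-k only needs one chunk of scores in memory at a time, so peak memory depends on the chunk size rather than the full sequence length.

```python
import numpy as np


def topk_full(q, keys, k):
    """Baseline: score every key at once, then take the top-k indices."""
    scores = keys @ q  # one score per key, all held in memory together
    return np.argsort(-scores, kind="stable")[:k]


def topk_streaming(q, keys, k, chunk_size):
    """Hypothetical chunked variant: keep only a running top-k between chunks."""
    best_scores = np.empty(0, dtype=np.float64)
    best_idx = np.empty(0, dtype=np.int64)
    for start in range(0, keys.shape[0], chunk_size):
        chunk = keys[start:start + chunk_size]
        scores = chunk @ q  # scores for this chunk only
        # Merge the running winners with the new chunk's candidates.
        cand_scores = np.concatenate([best_scores, scores])
        cand_idx = np.concatenate(
            [best_idx, np.arange(start, start + chunk.shape[0], dtype=np.int64)]
        )
        # Stable sort keeps ties in ascending index order, matching topk_full.
        order = np.argsort(-cand_scores, kind="stable")[:k]
        best_scores = cand_scores[order]
        best_idx = cand_idx[order]
    return best_idx
```

Under these assumptions, the streaming version returns the same indices as the all-at-once version, because any key in the global top-k is also in the top-k of every prefix that contains it.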
SMRTR provides this summary for quick context. The original article belongs to Hacker News.