SubQ – a sub-quadratic LLM built for multi-million token reasoning
SMRTR summary
SubQ is a new AI model built on a sub-quadratic sparse-attention architecture: instead of computing attention between every pair of tokens, it computes only the interactions that matter. At 12 million tokens, this reduces attention compute by nearly 1,000 times and makes it 56 times faster than FlashAttention-2. It handles full code repositories and long agent tasks in a single prompt, rivaling top models such as GPT-5 and Claude Opus on key benchmarks.
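To make the scale of that reduction concrete, here is a rough back-of-the-envelope sketch in Python. It assumes each query token attends to a fixed budget of keys, which is only one common way to build sparse attention; the summary does not say how SubQ selects tokens, and the budget value and function names below are hypothetical, chosen purely for illustration.

```python
def dense_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full (quadratic) attention."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored when every query attends to a fixed budget of keys.

    ASSUMPTION: a fixed per-query budget; not confirmed as SubQ's method.
    """
    return n_tokens * min(keys_per_query, n_tokens)


def reduction_factor(n_tokens: int, keys_per_query: int) -> float:
    """How many times fewer pairs the sparse variant scores."""
    return dense_pairs(n_tokens) / sparse_pairs(n_tokens, keys_per_query)


n = 12_000_000   # context length quoted in the summary
k = 12_288       # hypothetical per-query budget, illustrative only

print(f"dense pairs:  {dense_pairs(n):.3e}")
print(f"sparse pairs: {sparse_pairs(n, k):.3e}")
print(f"reduction:    {reduction_factor(n, k):.1f}x")
```

Under this assumption the reduction factor is simply n / k, so a budget on the order of 12,000 keys per query would be enough to produce a reduction of nearly 1,000 times at 12 million tokens. Counting scored pairs is not the same as measuring wall-clock time, which also depends on memory access patterns and kernel efficiency.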
SMRTR provides this summary for quick context. The original article belongs to Hacker News.