SMRTR AI• Aug 5, 2025• Interesting Engineering

New gpt-oss model from NVIDIA and OpenAI hits record 1.5M tokens per second

SMRTR summary

OpenAI and NVIDIA have released two open-weight language models, gpt-oss-120b and gpt-oss-20b, that achieve record speeds of 1.5 million tokens per second on NVIDIA hardware. Both models are licensed under Apache 2.0 and deliver reasoning capabilities comparable to proprietary systems. The larger model matches OpenAI's o4-mini, while the smaller one runs on devices with just 16GB of memory, putting advanced AI within reach of more developers.

SMRTR provides this summary for quick context. The original article belongs to Interesting Engineering.
