SMRTR AIJan 25, 2026Daily.dev

The Hidden Attack Surface in Every LLM: How Special Tokens Enable 96% Jailbreak Success Rates

SMRTR summary

Security researchers discovered that attackers can exploit special tokens in large language models to achieve a 96% jailbreak success rate by injecting control sequences like <|im_start|> directly into user input. These tokens, designed to structure conversations and mark role boundaries, are treated as authoritative commands rather than regular text, allowing attackers to hijack system instructions and bypass safety measures. The vulnerability affects major models including GPT-3.5, GPT-4, and LLaMA, with techniques ranging from direct role-switching to invisible Unicode payload injection that evades detection systems.

SMRTR provides this summary for quick context. The original article belongs to Daily.dev.

Read the original article
SMRTR AI

Get the next batch of curated summaries in your inbox.

This archive is built from SMRTR newsletter summaries. Subscribe for hand-picked stories without the extra noise.