SMRTR AI• Jun 22, 2026• Hacker News

A Theory of Why Prompt Injection Works

SMRTR summary

LLMs process everything — user prompts, system instructions, webpage data — as one continuous stream of text, relying on role tags to distinguish commands from data. Researchers found that LLMs identify roles by writing style rather than actual tags, making them vulnerable to "CoT Forgery" attacks that mimic reasoning style to hijack model behavior, pushing jailbreak success rates from near-zero to 60%.

SMRTR provides this summary for quick context. The original article belongs to Hacker News.

Read the original article

A Theory of Why Prompt Injection Works

Get the next batch of curated summaries in your inbox.