Anthropic Finds LLMs Can Be Poisoned Using Small Number of Documents
SMRTR summary
Anthropic's research finds that attackers need only 250 malicious documents in the training data to plant a backdoor in a large language model, regardless of model size. The study tested models from 600M to 13B parameters and found that the number of poisoned documents required stays fixed rather than growing with the model, which makes poisoning attacks far more feasible than previously thought. Because producing 250 malicious documents is trivial compared with producing millions, the barrier to attacking LLMs this way is much lower than assumed.
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.