Unlocking a Million Times More Data for AI
SMRTR summary
Researchers dispute the "peak data" claims made by AI industry leaders, pointing out that top AI models train on only terabytes of data while the world has digitized 180 zettabytes, a million times more. Their proposed Attribution-Based Control framework would let data owners keep control of their data while still contributing it to AI development, potentially unlocking vast private datasets through a government-led program modeled on ARPANET.
SMRTR provides this summary for quick context. The original article belongs to Hacker News.