Inside Meta’s race to beat OpenAI: “We need to learn how to build frontier and win this race”
SMRTR summary
Meta's internal communications reveal plans to train AI models such as Llama on copyrighted data, including material from the book piracy site LibGen. The company aimed to match the capabilities of OpenAI's GPT-4 while taking steps to conceal its data sources and mitigate potential legal and regulatory risks. This approach highlights the competitive race among AI companies to acquire training data amid growing concerns about copyright infringement and data scarcity.
SMRTR provides this summary for quick context. The original article belongs to Daily.dev.