A group of prominent authors, including Kai Bird and Jia Tolentino, has filed a lawsuit against Microsoft in New York federal court, alleging the tech giant used pirated versions of their books without permission to train its Megatron AI model. The case represents the latest in a series of high-stakes copyright battles between content creators and major tech companies over the unauthorized use of copyrighted material in AI development.
What you should know: The lawsuit alleges Microsoft used nearly 200,000 pirated books to train Megatron, an AI model designed to generate text responses to user prompts.
• The authors claim Microsoft created “a computer model that is not only built on the work of thousands of creators and authors, but also built to generate a wide range of expression that mimics the syntax, voice, and themes of the copyrighted works on which it was trained.”
• The plaintiffs are seeking a court order to stop Microsoft’s alleged infringement and statutory damages of up to $150,000 for each work that was allegedly misused.
The big picture: The lawsuit is part of a broader legal fight in which authors, news outlets, and other copyright holders are challenging tech companies over AI training practices.
• Similar cases have been filed against Meta Platforms, Anthropic, and Microsoft-backed OpenAI over alleged misuse of copyrighted material.
• The timing is significant: the lawsuit comes one day after a California federal judge ruled that Anthropic made fair use of authors’ material for AI training but may still face liability for pirating books. That ruling was the first U.S. decision on using copyrighted materials without permission for generative AI training.
Why this matters: The outcome could establish crucial precedents for how AI companies can legally access and use copyrighted content for training purposes.
• Tech companies argue they make fair use of copyrighted material to create transformative content, warning that forced payments to copyright holders could severely hamper the growing AI industry.
• The case highlights the fundamental tension between protecting creators’ intellectual property rights and enabling AI innovation that relies on vast datasets of existing content.
What’s next: Microsoft has not yet responded to requests for comment on the lawsuit, while attorneys for the authors declined to provide additional details about their legal strategy.