Adobe hit with AI lawsuit over pirated book claims
A new class-action suit accuses Adobe of using pirated books in AI training. Here's what this means for content creators and marketers.
Another tech giant is under legal scrutiny for how it built its AI. Adobe is now facing a proposed class-action lawsuit that alleges the company used pirated books—including those written by the lead plaintiff—to train its SlimLM model. The case adds more heat to the ongoing legal storm over AI and copyrighted content.
For marketers and content creators who rely on GenAI tools to accelerate campaigns, this case is more than just another tech headline. It raises urgent questions about the legality of the data behind AI tools, the reputational risks of using them, and what responsible use of GenAI should look like going forward.
This article breaks down the lawsuit, explains the broader trend of legal action in AI development, and highlights what marketers should do now to avoid future fallout.
What happened?
The lawsuit was filed by author Elizabeth Lyon, who claims Adobe used pirated versions of her nonfiction books in the training data for its SlimLM language model. Adobe describes SlimLM as a small language model optimized for document-related tasks on mobile devices.
According to the complaint, SlimLM was trained using SlimPajama-627B, a dataset released by Cerebras in June 2023. This dataset is said to be derived from another set known as RedPajama, which includes the Books3 dataset. Books3 contains more than 190,000 books and has been cited in several copyright lawsuits involving Apple, Salesforce, and now Adobe.
Lyon argues that because SlimPajama includes content from Books3, Adobe effectively trained its AI on copyrighted material without permission. The lawsuit alleges that this dataset was compiled and manipulated in ways that violate copyright laws and hurt authors whose work was scraped without consent.
Adobe has yet to release a public statement about the case.
Why this keeps happening
This isn’t the first lawsuit targeting how GenAI systems are trained. Apple, Salesforce, and Anthropic have all been dragged into similar legal disputes. Just a few months ago, Anthropic agreed to pay US$1.5 billion to settle claims that it used pirated works to train its chatbot, Claude.
The legal issue boils down to this: AI models need massive amounts of data to become useful. In the rush to build smarter tools, many companies have used open web datasets that included everything from Wikipedia entries to full-length books—sometimes without verifying the licensing status of that content.
For marketers, that means some of the AI tools currently in use could be powered by data obtained through questionable means. The potential legal and ethical risks are no longer abstract. They are becoming real enough to land in court.
What marketers should know
Whether you're using AI to generate blog copy, automate customer support, or produce social visuals, this lawsuit is a wake-up call. The tools may be efficient, but their training data could come with hidden liabilities. Here are four things every marketer should do now.
1. Know where your AI gets its data
Ask your vendors how their models were trained. If they can’t tell you, that’s a red flag. Look for AI tools that are open about their training data or use properly licensed, human-curated sources.
2. Audit your AI-driven content workflows
Review how and where AI is used in your content production. Document which tools are in use and what type of content they generate. This will help you respond quickly if a legal challenge arises around content origin.
3. Include AI indemnity in contracts
Make sure your contracts with AI vendors include clauses that protect your company from liability if the model was trained on infringing data. Legal teams should be proactive about this going forward.
4. Build a responsible AI use policy
Start creating internal guidelines for how AI is used in your marketing and content teams. Include standards around transparency, attribution, and when human review is required. This is key for both compliance and trust.
Adobe’s latest legal challenge is part of a growing pattern. As AI becomes a default part of marketing, the pressure to clean up its training methods is building fast. Marketers can no longer afford to treat AI tools as black boxes. From legal exposure to brand safety, the risks tied to data misuse are rising.
Staying informed, asking the right questions, and putting strong policies in place now will help marketers stay ahead of the curve and avoid getting caught in the crossfire of future lawsuits.