Is your company’s AI training data breaching copyright law?

Generative AI is shaking up everything – including how we think about copyright. And right now, there are more questions than answers.

Picture this: your marketing team uses a generative AI tool to whip up campaign content. It looks great. It’s fast. Everyone’s happy. But what if that tool was trained on copyrighted material – and no one asked permission?

That’s not a hypothetical. It’s the legal grey zone companies across the UK and EU are already in. And for in-house legal teams, it means one more risk to navigate – one that’s poorly understood and barely regulated.

Here’s what you need to know – and how to steer your business safely.

Why the copyright question matters

The key issue is this: generative AI tools are trained on vast datasets – often scraped from the internet – that include books, images, articles, and more. Many of those materials are protected by copyright.

When an AI model ingests that content to learn patterns and generate outputs, it technically makes copies – and under EU copyright law, reproduction is a restricted act that normally needs the rightsholder’s permission.

So, does training an AI model on copyrighted content amount to copyright infringement?

It depends. And that’s the problem.

Text and data mining: useful, but limited

Under the EU’s Copyright in the Digital Single Market (CDSM) Directive, there’s a “text and data mining” (TDM) exception. Article 4 allows anyone with lawful access to mine copyright-protected works, including for commercial purposes – but only if rightsholders haven’t expressly reserved that use (the “opt-out”).

In theory, that sounds like a fair balance. In practice, it’s a mess.

There’s no standard way to opt out. Rightsholders are using everything from metadata to robots.txt files. Developers are unsure what counts as a valid opt-out. And AI systems trained on massive, unstructured datasets can’t reliably filter out opted-out works.

The result? Legal uncertainty for businesses using – or building – generative AI tools.
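
To make the robots.txt point concrete, here’s a minimal Python sketch of how a developer might check whether a site has blocked a particular crawler. It uses Python’s standard urllib.robotparser module. The crawler name “ExampleAIBot”, the sample robots.txt content and the is_opted_out helper are all hypothetical, made up for illustration. And a robots.txt rule is only one possible signal: whether it counts as a valid opt-out under Article 4 is exactly the open question described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# "ExampleAIBot" is an invented crawler name, not a real product.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_opted_out(robots_txt: str, crawler: str, url: str) -> bool:
    """Return True if robots_txt disallows `crawler` from fetching `url`.

    This only reads robots.txt rules. It does not check metadata,
    website terms or any other way a rightsholder might reserve rights.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(crawler, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    for bot in ("ExampleAIBot", "SomeOtherBot"):
        print(bot, "blocked:", is_opted_out(ROBOTS_TXT, bot, page))
```

A check like this only works when the site owner knows the crawler’s name in advance and the developer applies it at the moment of collection. It does nothing for content that has already been scraped, copied onto other sites or bundled into a third-party dataset.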

AI-generated content: who owns what?

The next puzzle: what about the outputs? If an AI tool generates a report, a piece of code, or a design – is it protected by copyright?

Under current EU law, only works with human authorship qualify. That means fully machine-generated content typically falls into the public domain.

But what if a human guides the process – say, tweaking prompts or refining outputs?

That’s where the law gets murky. Different Member States interpret “authorship” differently. So businesses using AI-assisted content need to tread carefully – especially if they plan to commercialise it or enforce rights over it.

The “value gap”: creators want payback

Another headache for rightsholders – and a reputational risk for AI adopters – is the growing outcry over fair remuneration.

Artists, authors, and musicians are rightly asking: why should AI companies profit from our work without paying us?

Policymakers are listening. The European Parliament is exploring options like statutory licensing schemes and traceability requirements (think watermarking and usage logs) to level the playing field.

If those reforms land, businesses using AI tools may face new compliance burdens – and costs.

What in-house legal teams should do now

Until the rules catch up, legal teams need to get proactive. Here’s how:

  • Audit your AI tools. What models are being used? What do their terms say about training data and output rights?
  • Push for transparency. Ask vendors how their models were trained and whether any opt-out mechanisms were respected.
  • Update IP policies. Clarify ownership of AI-assisted work, especially where employee or contractor input is involved.
  • Watch for legal changes. The EU’s AI Act and possible copyright reforms could reshape the rules within the next year.
  • Support the business. Help your teams understand the risks and make informed choices about AI use.

The bottom line?

AI is a powerful tool – but it’s built on a fragile legal foundation. As the tech gallops ahead, copyright law is playing catch-up.

For in-house lawyers, this is your moment to lead – helping your business innovate safely, while respecting the rights that fuel creativity in the first place.
