Guide

What is AI Training Data?

How artificial intelligence models are trained on creative work, and what that means for the people who made it. A primer for creators.

How-To

How to File a DMCA Takedown

Step-by-step instructions for filing a DMCA takedown notice when your creative work appears online without permission.

Checklist

Is Your Work in a Dataset?

How to check whether your creative work has been included in AI training datasets without your consent or knowledge.

Guide

Class Action Guide

What class actions are, how they work, and how individual creators can join lawsuits against companies that stole their work.

Checklist

Documenting Infringement

A creator's checklist for documenting copyright infringement. What to save, how to timestamp, and what evidence matters.

Reference

Statute of Limitations

What creators need to know about the copyright statute of limitations, filing deadlines, and preserving the right to sue.

Reference

Data Centers: The Internet's Body

Where data centers are, what they consume, how communities are responding, and what happens if AI architecture changes.


Tools

AI Training Database

Search our database of known AI training datasets. Find out whether your creative work — images, writing, music, or code — may have been used to train AI models without your permission.

We've catalogued the major publicly known AI training datasets, including LAION, Common Crawl, Books3, The Pile, and GitHub code repositories. Each entry describes what's in the dataset, known copyright concerns, relevant lawsuits, and tools for checking whether your work appears in it.
