They used your content to train AI...

and without telling you!

Let’s be clear: you didn’t opt in.

Over the last year, major AI companies, including Apple, Nvidia, Anthropic, and Salesforce, have trained large language models using YouTube transcripts from over 173,000 videos across 48,000+ channels.

They used:

  • Auto-generated captions

  • Closed captions

  • In some cases, full transcripts

And they scraped it all without consent, credit, or compensation.

They’re not just reading the news. They’re eating it. And we’re the meal.

What This Means for You

  • Your voice, insights, and storytelling are being ingested by AI models

  • Those models may go on to replicate your style, summarize your content, or generate competing work

  • You don’t get paid. You don’t get credit. You may not even know it’s happening

This isn’t theory: it’s confirmed, documented, and already in use.

What You Can Do (Right Now)

1. Shield Yourself on YouTube

Go to:
→ YouTube Studio → Settings → Advanced Settings
→ Toggle OFF “Allow AI to train on my content”

This only applies to YouTube’s internal AI tools (like Gemini, formerly Bard). It does not stop third parties from scraping your content (which is annoying).

2. Tag Your Content “Do Not Train”

Add this to your descriptions and captions:
[do not train on this content]

Some tools (like Spawning.ai) are working to honor these tags when assembling AI datasets. It may also help build legal precedent later, much as Creative Commons did for attribution.
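If you have a large back catalog, you don’t have to paste that tag into every description by hand. Here is a rough sketch (not an official workflow) that appends the notice in bulk using the YouTube Data API v3 via google-api-python-client; the video IDs and the token.json credentials file are placeholders you’d replace with your own OAuth setup for your channel.

```python
# Rough sketch: append a "do not train" notice to video descriptions in bulk
# via the YouTube Data API v3 (google-api-python-client). Assumes you have
# already run an OAuth flow for your channel and saved the result to
# token.json (a placeholder filename) with the youtube.force-ssl scope.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

NOTICE = "[do not train on this content]"
VIDEO_IDS = ["VIDEO_ID_1", "VIDEO_ID_2"]  # placeholders: your own video IDs

creds = Credentials.from_authorized_user_file(
    "token.json", scopes=["https://www.googleapis.com/auth/youtube.force-ssl"]
)
youtube = build("youtube", "v3", credentials=creds)

for video_id in VIDEO_IDS:
    # videos.update expects the full snippet back (title, categoryId, ...),
    # so fetch the current snippet before editing the description.
    response = youtube.videos().list(part="snippet", id=video_id).execute()
    if not response.get("items"):
        continue  # not found, or not a video you own
    snippet = response["items"][0]["snippet"]

    if NOTICE in snippet.get("description", ""):
        continue  # already tagged
    snippet["description"] = (snippet.get("description", "") + "\n\n" + NOTICE).strip()

    youtube.videos().update(
        part="snippet",
        body={"id": video_id, "snippet": snippet},
    ).execute()
    print(f"Tagged {video_id}")
```

The fetch-then-update round trip is there because the API requires the full snippet (title, category, and so on) on every update, not just the field you changed.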

3. Check if You’ve Been Used

Use tools like:

  • HaveIBeenTrained.com to search image datasets

  • Search your video titles and channel name in public text training datasets like The Pile or C4 (a rough search sketch follows this list)

  • Stay tuned to lawsuits like The New York Times v. OpenAI; these may open the door to future compensation
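For the text datasets, one rough way to spot-check is to stream them from the Hugging Face Hub and search for your channel name or exact video titles. The sketch below uses the allenai/c4 corpus as an example; the search terms and the record cap are placeholders, and a clean result here doesn’t prove your content isn’t in some other dataset.

```python
# Rough sketch: stream a public text training dataset and look for your
# channel name or video titles. Uses the Hugging Face `datasets` library and
# the allenai/c4 ("en") corpus as an example; The Pile and other corpora
# live under different dataset IDs. Streaming avoids downloading the full
# corpus, but a complete pass is still slow -- treat this as a spot check.
from datasets import load_dataset

SEARCH_TERMS = ["My Channel Name", "An Exact Video Title"]  # placeholders
MAX_RECORDS = 200_000  # cap the scan so the sketch finishes in reasonable time

stream = load_dataset("allenai/c4", "en", split="train", streaming=True)

hits = 0
scanned = 0
for record in stream:
    if scanned >= MAX_RECORDS:
        break
    scanned += 1
    text = record.get("text", "").lower()
    for term in SEARCH_TERMS:
        if term.lower() in text:
            hits += 1
            print(f"Possible match in record {scanned}: {record['text'][:120]!r}...")

print(f"Scanned {scanned} records, {hits} possible matches.")
```

It’s slow and imperfect, but for the text side there isn’t (yet) a polished search tool like HaveIBeenTrained.com, so a manual scan like this is sometimes the best you can do yourself.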

⚖️ Legally?

It’s a gray zone, but not hopeless.

  • U.S. copyright law protects your original content

  • But AI companies argue “fair use” for training on public data

  • Lawsuits are underway to test that argument

If precedent is set in favor of creators, it could:

  • Lead to required licensing

  • Open the door for class-action-style compensation

  • Force platforms to notify and credit creators

Can You Get Paid for This?

Not yet. But here’s what’s brewing:

  • Platforms like Spawning.ai and Fairly Trained are lobbying for an “AI license” model, where creators opt in and get paid

  • Think “Spotify of AI content”: permission-based, trackable, and monetized

  • If you’re early to this movement, you’ll be better positioned when it arrives

TL;DR

  • AI companies scraped your YouTube transcripts without asking

  • You can opt out of YouTube’s training, but not third-party scraping

  • Use “do not train” tags, check datasets, and track upcoming lawsuits

  • Legal and financial options are on the horizon, but only if creators speak up and act now

  • If you want help navigating future AI licensing or identifying companies that may be using your content for their own commercial training, email me at [email protected]. I’d love to help!