AI’s Great (Training) Robbery: How Small Creators Can Fight Back

Benjamin Hiorns

Published 10/09/2025

AI models are feasting on creative work without so much as a “may I?” – vacuuming up art, writing, code, and more to feed their insatiable algorithms. But we’re not taking it lying down. From clever watermarking tricks to legal maneuvers and community-led opt-out tools, creators are finding ways to protect their IP and livelihoods in the age of rampant AI data grabs.

The Global Free-for-All on Creative IP

It’s a story as old as the internet itself: big tech helping itself to whatever content it can find without asking for permission. Only now it’s on an industrial scale, and the content in question is your creative work.

Generative AI systems are trained on billions of images, songs, and texts scraped from the web. Often this is done without permission, credit, or compensation, leading creators to justifiably call it theft. It’s effectively an uncompensated use of IP. In practical terms, that means an AI could be spitting out art in your style or writing eerily similar to your blog posts, all because it devoured your portfolio during training. Small businesses and freelancers across the globe are watching in horror as years of work become grist for someone else’s algorithmic mill.

Legally speaking, the situation is a wild west with different rules in different regions. In the US, authors and artists have launched class-action lawsuits, arguing that companies “unlawfully copied and processed millions of images protected by copyright” to train AI models. (The harmed artists insist this is not a victimless crime – “the harm to artists is not hypothetical,” their lawsuit stresses.)

American writers’ groups like the Authors Guild are pushing for new laws and advising members to explicitly forbid AI training on their content. Over in Europe, regulators are taking a stricter stance: the EU’s copyright rules require that rights holders be given a way to opt out of text and data mining for commercial AI, which is why, for example, Meta had to offer an AI opt-out to users in the EU and UK when it wanted to train on Facebook/Instagram posts.

By contrast, countries like Japan have embraced broad exceptions that allow AI training on copyrighted material by default (much to creators’ chagrin). And the UK, after initially flirting with a free-for-all approach, is now leaning towards an opt-out regime that mirrors the EU model – meaning British creators might soon need to add “please don’t scrape my stuff” flags to their websites if they want to be excluded.

In short, it’s a patchwork of policies, and global creatives must navigate a confusing question: How do I protect my intellectual property when the law hasn’t caught up?

Why it Matters to the Little Guys

Polina Nikolova

If you’re a freelance illustrator, an independent photographer, or a self-published author, you might be feeling a bit like David facing an army of Goliaths. The threat isn’t abstract or years away – it’s here now. Artists have already seen clients vanish and gigs dry up because a prompt in Midjourney or ChatGPT can imitate their style on the cheap.

Concept artist Karla Ortiz recalls the “disturbing” moment she discovered AI models had been trained on “almost the entirety of my work, and the work of almost every single artist I knew,” all without consent. To add insult to injury, users were explicitly prompting the AI with artists’ names – essentially using their reputation as a shortcut to copy their style.

Ortiz and others even heard of jobs and internships being lost to AI-generated substitutes by late 2022. When a major ballet company can use an AI-generated campaign instead of hiring a photographer or illustrator, as happened recently with the San Francisco Ballet for their latest Nutcracker performance, you know no creative job is truly secure.

Visual artists aren’t the only ones feeling the squeeze either. Writers, too, have found their stories and articles regurgitated by AI systems. A Google search AI feature began serving up direct answers mined from websites – great for users, perhaps, but a “slow strangulation” for the bloggers and journalists who depend on clicks.

Small content creators suddenly see less traffic and zero compensation while AI-driven “overview” snippets monetize their work. Musicians and voice actors have similarly watched (with mounting panic) as AI clones their voices and styles. The fundamental issue in all these examples is consent – or rather, the lack of it.

The playing field feels tilted steeply in favor of tech giants and against the independent creator trying to pay rent. But don’t cue the funeral dirge for human creativity just yet; around the world, the creative community is mobilising with an inspiring mix of tech savvy, legal action, and old-fashioned collective backbone.

Lawyers, Licenses and Lobbies

One avenue is, of course, the legal route – though a lone freelancer probably isn’t going to march into court against a Silicon Valley unicorn. That hasn’t stopped groups of creators from banding together. In the US, a group of artists spearheaded by Sarah Andersen, Kelly McKernan and Karla Ortiz filed a landmark class-action lawsuit in 2023 against Stability AI, Midjourney and DeviantArt for infringing their copyrights via AI training.

And in the UK, stock image giant Getty Images (hardly a “small” creator, but representing thousands of photographers) launched a high-profile suit against Stability AI as well, arguing that Stability “did not pursue the official licensing” and instead just helped itself to content in an “unlawful” manner.

These cases are ongoing and raise tricky questions – how do you prove a specific AI output infringes on a specific training image? What constitutes “substantial similarity” when an AI is remixing thousands of sources? As of 2025, the answers are still up for debate in court. But the very act of suing has brought public attention and could lead to precedent that deters willy-nilly scraping. Even if you’re not directly involved in a lawsuit, you might eventually benefit from any legal clarity (or settlements) that emerge.

For individual freelancers looking for a more immediate shield, a bit of legalese in your contracts and copyright notices can help. The Authors Guild, for instance, recommends that writers add a “No AI Training” clause in their copyright page or online posts. Something along the lines of “Any use of this work to train AI is expressly prohibited – the author reserves all rights to such use” can put would-be infringers on notice.

It’s not a magic force-field (and admittedly, the bad actors may ignore it), but it establishes your intent and could bolster a future legal claim. Likewise, some software developers and artists have started using new licenses that explicitly ban training uses of their content – a sort of “DIY copyright upgrade” since current laws are lagging.

On the flip side, creatives are lobbying for stronger laws. Unions and guilds are calling on legislators to update copyright rules for the AI era, much like how the DMCA was introduced in the late ’90s to tackle internet copying. We’re now seeing proposals (in the US Congress, for example) to establish clearer intellectual property rights over one’s data, style, or likeness in AI contexts.

Small businesses and freelancers don’t have big lobbying budgets, but by supporting creative industry coalitions, they amplify their voice. In Europe, organizations like the Association of Illustrators and Creators’ Rights Alliance campaign for artist-friendly regulations. Joining such groups or at least keeping informed through them can turn a lone battle into a broader movement.

Finally, there’s the route of licensing and negotiation – turning the tables and making the AI companies come to you for permission (and payment). This might sound far-fetched for a solo creator but consider how the industry is evolving.

Shutterstock, a major stock image marketplace, decided to partner up instead of fight: it struck deals to license content for AI training (with OpenAI and Meta among others) and even set up a fund to compensate artists whose works are used. In effect, they said “if you’re going to use our contributors’ images, let’s do it above-board and share the benefits.”

Now, a small illustrator likely can’t negotiate with OpenAI on their own – but you can choose to sell or license through platforms that have these arrangements, or through future collectives that might act as bargaining agents for creators. It’s a bit like musicians licensing their songs for commercials: if AI firms need training data, perhaps tomorrow they’ll be buying ethical datasets curated from creators who opt in for a fee.

We’re already seeing early moves in this direction, which could eventually give freelancers another income stream (imagine being paid for your past work’s contribution to some AI’s smarts!). Until that day comes, though, most are focusing on keeping their work out of the AI maw – which brings us to the technical defenses.

Beating the Bots with Opt-Outs and Roadblocks

If the AI scrapers are like roaming robots trying to vacuum up every creative crumb on the internet, one tactic is to put up some digital “keep out” signs. Many smaller businesses maintain their own websites or portfolios, and here a bit of tech savvy can go a long way. By adding rules to your site’s underlying files, you can tell AI crawlers to get lost – or at least ask them politely to stay away.

OpenAI, for example, publishes a user-agent name for its GPTBot crawler, and if your site’s robots.txt file disallows it, they’ve said they won’t scrape your site. Similarly, you can add meta tags in your HTML which signal to compliant bots that you’re opting out of AI training.
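For the curious, here is roughly what those “keep out” signs can look like in practice. The Python sketch below builds a robots.txt file that turns away a few AI crawlers, then checks the result with the parser from Python’s standard library. Treat the details as assumptions rather than gospel: the crawler names (GPTBot, Google-Extended, CCBot) are an illustrative, incomplete list that should be checked against each company’s current documentation, the helper names are made up for this example, and the “noai, noimageai” meta tag is an informal convention rather than a formal standard.

```python
from urllib.robotparser import RobotFileParser

# Illustrative crawler names only (an assumption, not a complete list).
# Check each vendor's documentation for the names they currently use.
AI_CRAWLER_AGENTS = ["GPTBot", "Google-Extended", "CCBot"]

# Informal opt-out meta tag for the <head> of a page. It is not a formal
# standard; only bots that choose to look for it will respect it.
NO_AI_META_TAG = '<meta name="robots" content="noai, noimageai">'


def build_robots_txt(blocked_agents):
    """Return robots.txt text that blocks the given crawlers site-wide."""
    lines = []
    for agent in blocked_agents:
        lines.append(f"User-agent: {agent}")
        lines.append("Disallow: /")
        lines.append("")  # blank line separates one group from the next
    # Every other crawler (regular search engines, for instance) stays welcome.
    lines.append("User-agent: *")
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, agent, url):
    """Ask Python's built-in parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    robots_txt = build_robots_txt(AI_CRAWLER_AGENTS)
    print(robots_txt)
    print(NO_AI_META_TAG)
    page = "https://example.com/portfolio/"
    print("GPTBot allowed?", is_allowed(robots_txt, "GPTBot", page))
    print("Other bot allowed?", is_allowed(robots_txt, "SomeOtherBot", page))
```

The generated text would be saved as robots.txt at the root of your site, and the meta tag pasted into the head of any page you want flagged. Neither does anything to a crawler that chooses not to read it.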

Now, an important caveat: these measures rely on an honour system, so a scrupulous AI company will respect your opt-out, but a rogue one might not. Still, major players under public scrutiny are likely to comply, especially as regulations nudge them. Neither Google’s new AI search crawler nor OpenAI’s GPTBot wants the PR (or legal) nightmare of ignoring explicit opt-outs.

For those less tech-inclined, good news: there are tools that make this easier. Spawning, an artist-driven AI startup, offers Kudurru, a tool (currently in beta) that actively blocks known AI scrapers from accessing your site. They even have a WordPress plugin, meaning if your portfolio or blog runs on WordPress, you can install Kudurru to do the blocking for you.

Kudurru doesn’t just ask scrapers to leave, it can apparently detect and thwart them – acting like a bouncer who not only turns away uninvited guests at the door but also catches those trying to sneak in the back window. There’s also an emerging concept of “Do Not Train” registries.

Spawning’s other well-known service, Have I Been Trained, lets you search a massive dataset (LAION-5B) to see if your art or photos are in there. If they are, you can click to add them to a Do Not Train list that some AI firms have agreed to honour.

Stability AI (creators of Stable Diffusion) and Hugging Face (a major AI model hub) have publicly stated they will remove items on that registry from future training datasets. This won’t magically yank your content out of models that are already trained (what’s done is done), but it can stop future or ongoing scraping in its tracks.

Think of it as putting your work on an “AI no-fly list.” It’s not comprehensive (plenty of AI projects might ignore the registry) but if the big names comply, that’s a chunk of the problem mitigated.

Even social media platforms are getting in on opt-outs (sometimes under regulatory duress). As mentioned, Meta now offers EU and UK users a way to object to their posts being used in training datasets. It’s a buried setting (of course it is), but it exists.

Smaller content platforms are also responding to user pressure: DeviantArt introduced an opt-out tag for images (so artists can mark their pieces “noAI” to exclude them from DeviantArt’s own AI art generator training) and sites like ArtStation now allow creators to flag “do not train” on uploads after artists staged protests by plastering “NO AI” on their portfolios.

In the 3D printing community, marketplaces like Cults3D straight-up prohibit AI use of any designs on their platform. All these measures share a common theme – creators carving out a little safe zone where their work isn’t freely up for grabs. They’re imperfect shields (and sometimes only region-specific), but they’re certainly better than throwing up your hands and doing nothing. If you have a web presence, spending an hour to implement a couple of these opt-outs can be well worth the peace of mind.

Poisoning the AI Well

What about the images (or text, or audio) you’ve already shared with the world? You want people to see and enjoy your work – you just don’t want it swept into some machine learning model without consent. Enter the idea of watermarking and cloaking, the art of sharing your content in a form that’s still fine for humans but garbled for AI.

This is where things get delightfully nerdy. Researchers at the University of Chicago have developed free tools like Glaze and Nightshade that let artists add subtle perturbations to their images. To the human eye, a “glazed” artwork looks almost identical to the original. But to an AI’s mathematical gaze, that image might appear to be in a completely different style or subject.

For example, a painting of a cat processed through Glaze might trick an AI into learning it as a painting of a pumpkin (just as an illustration). The result? If that AI later tries to generate something in your style, it produces nonsense or at least not a faithful imitation of you.

Glaze focuses on protecting style – it “cloaks” the style by shifting the AI-visible patterns. Nightshade goes further by altering the content – an image of a cow might register as a picture of a handbag to the AI.

These are like feeding the AI poison pills hidden in tasty treats: the more it gobbles your cloaked art, the more messed up its understanding becomes. According to the Glaze team, these measures survive common tactics like resizing or slight filtering, so an AI can’t easily wash off the “invisible ink.”
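To get a feel for why a change too small for people to notice can still flip what a model sees, here is a toy Python sketch. It is emphatically not how Glaze or Nightshade work internally. It swaps in the simplest possible stand-in, a made-up linear scorer over eight pixel values, and nudges each pixel by a tiny bounded amount in the direction that hurts the score most. Every name and number in it (the weights, the labels, the 0.05 budget) is invented for illustration.

```python
# Toy "model": a linear scorer over 8 pixel intensities in the range 0..1.
# The weights are invented for this example; real models are far larger.
WEIGHTS = [0.5, -0.25, 0.75, -0.5, 0.25, -0.75, 0.5, -0.25]


def score(pixels):
    """Weighted sum of the pixels; positive means the model sees a cat."""
    return sum(w * p for w, p in zip(WEIGHTS, pixels))


def label(pixels):
    """Turn the raw score into one of two hypothetical labels."""
    return "cat" if score(pixels) > 0 else "pumpkin"


def cloak(pixels, budget):
    """Shift every pixel by at most `budget` to push the score down."""
    cloaked = []
    for w, p in zip(WEIGHTS, pixels):
        step = -budget if w > 0 else budget  # move against the weight's sign
        cloaked.append(min(1.0, max(0.0, p + step)))  # keep a valid intensity
    return cloaked


if __name__ == "__main__":
    original = [0.5] * 8  # a flat mid-grey "image"
    protected = cloak(original, budget=0.05)
    biggest_change = max(abs(a - b) for a, b in zip(original, protected))
    print("original label: ", label(original))
    print("protected label:", label(protected))
    print("largest pixel change:", biggest_change)
```

No pixel moves by more than 5% of its range, yet the toy scorer’s label flips from cat to pumpkin. Real cloaking tools target deep image models and optimise far more carefully, but the intuition of a small, bounded, deliberately aimed change is similar.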

It’s worth noting that such defensive tech is in a cat-and-mouse race with AI developers (who, one imagines, aren’t too pleased about their models getting indigestion from poisoned data). The Glaze project warns this isn’t a permanent fix – it’s a present solution while the bigger battles play out. Cloaking also works better on some art styles than others (if your style is super minimalist or flat, the perturbations might become noticeable to humans). Still, for many digital artists it’s a godsend; Glaze has already been downloaded by tens of thousands of creators eager to defend their signature look.

In fact, entire platforms are springing up around this: there’s even a site called Cara for sharing “glazed” art specifically, integrating the tool so artists can safely showcase their portfolios.

For a more old-school approach, there’s always the trusty watermark. We’re not talking about the obvious “© Your Name” slapped in a corner (though that can at least make clear who the author is). The new wave is invisible or hard-to-detect watermarks that confuse AI training.

One initiative, ArtShield, offers a free “Watermarker” service that embeds a robust invisible pattern into your image. The idea is to camouflage your art from AI scrapers. Perhaps the watermark causes the AI to think the image is corrupted or triggers it to throw the data out as noise.

To a person viewing the image on your website, it looks normal; to a crawler or an AI model, it’s booby-trapped. Similarly, the Mist tool by the Psyker team adds an imperceptible watermark that, when learned by an AI, causes any generated images to come out with a glaring ugly watermark across them.

In other words, if someone unknowingly trains on your “misted” artworks, their AI will later produce telltale watermarked images that are pretty much useless to the end-user. It’s a clever form of revenge: steal my art for your AI, and your AI becomes a plagiarist that can’t cover its tracks. Some creators are experimenting with noisy textures or hidden patterns manually added to backgrounds of images as well – even a faint grid or abstract squiggle that isn’t obvious to viewers can throw off an AI’s training if done right.

Of course, traditional visible watermarks shouldn’t be dismissed either. Many photographers and illustrators still use prominent watermarks online to discourage casual theft. An AI might still ingest a watermarked image (some early text-to-image models famously learned to regurgitate Shutterstock’s watermark in their outputs because it was so common in the training data!), but at least that watermark’s presence is a constant reminder of provenance.

In fact, some experts have floated the idea that AI-generated images should themselves carry a watermark indicating they’re AI-made, both to protect consumers from deception and to credit the original data sources. That’s more on the policy side (a kind of watermark mandate) and is being debated at high levels.

For the individual creator, the takeaway is you have options to mark your territory. Whether it’s a stealth pixel cloak or an old-fashioned signature, these measures let you strike a balance: share your work with the world, but on terms that don’t make it easy for a machine to appropriate.

Solidarity over Defeatism

Perhaps the most heartening development in this uphill battle is the way creators are coming together. It’s easy to feel isolated but the reality is a community of freelancers, small studios, and allies is forming to tackle the problem collectively.

Online forums and Discord groups have popped up (the r/aiwars subreddit, for one, is literally dedicated to news on “all sides of the AI art debate”). These communities share tips, tools, and moral support. One artist might alert others, “Hey, I found our stuff in such-and-such dataset, here’s how to opt out.”

Developers who care about art have volunteered their skills to build tools like ArtShield and Glaze – labours of love to protect the creative commons. A group of “nerds in Pennsylvania,” as ArtShield describes its team, built their tool purely to help fellow artists defend themselves.

This grassroots camaraderie is something the AI companies didn’t count on. Each individual action (be it adding a meta tag or signing an open letter) might seem small, but multiplied across thousands of creators it becomes a movement. We see this in the public pressure put on platforms and legislators.

The UK’s pivot to include opt-outs came after musicians and artists from Paul McCartney to Kate Bush slammed the idea of free-for-all data mining. Adobe had to quickly reassure users that it wouldn’t train its AI on their Photoshop files after illustrators erupted in protest over a confusing policy update.

And let’s not forget the striking Hollywood actors and writers (the SAG-AFTRA and WGA strikes in 2023) who made AI a central contract issue – they stood up on behalf of even the smallest background actor or scriptwriter who didn’t want to be digitally replicated without pay. When big industries and independent creatives align on a cause, change tends to follow.

For freelancers and small businesses, a powerful step is to join professional associations or guilds in your field. Not only do these groups lobby on your behalf, but they often provide resources: template contracts with AI-use clauses, legal advice hotlines, and educational webinars on protecting IP.

They also foster that reassuring sense that you’re not alone in this fight. As the True Grit Texture Supply team (which cheekily dubbed their guide “How to Protect Your Art from Big AI”) put it, the aim isn’t to panic or turn every artist into a full-time tech ninja, but to inform and empower creators to do what’s feasible for them.

Maybe you can’t implement every single strategy (few of us have time to both create and endlessly safeguard our creations) but even doing one or two things can make a difference. And importantly, raising a collective ruckus is working: the conversation around ethics in AI has shifted, with more people acknowledging that creators deserve respect and reward. There is even a budding market for ethical AI training data (who’d have thought we’d see startups promising “fair trade” data?).

Ultimately, protecting your IP in the AI era isn’t about completely stopping the technology (that ship has sailed) – it’s about asserting your rights and value within this new landscape. It’s about saying yes, you can innovate, but not at my expense or without my permission.

That stance is beginning to be heard. From a candid, witty editorial in a creative industry magazine to formal hearings in government halls, the message is resonating that creators’ work is not just some free buffet for any algorithm to chow down on.

Will the AI companies and content creators find a harmonious middle ground? Possibly – if enough pressure is applied and practical solutions take hold. In the meantime, smaller creatives are not helpless.

Art isn’t dead and there will always be a demand for great work made by real life humans

You have tools to wield, from legal notices to code snippets to community action. Use them. Share knowledge with your peers. Push for better norms (like clients agreeing not to use your deliverables to train AI without a license). And perhaps most importantly, continue to create with the confidence that your human artistry has enduring value.

In a world increasingly flooded with machine-made content, authenticity can become a selling point. Your job is to protect that authenticity and demand it’s respected. AI may be here to stay, but so are the world’s scrappy, talented freelancers – and they’re learning to stand their ground.
