Predatory Practices of Generative AI Art Sourcing Explained
• Miyazaki denounced generative AI art as “an insult to life,” yet OpenAI launched a tone-deaf “Ghiblification” feature mimicking his style.
• AI art tools are trained on copyrighted works without consent, exploiting artists while hiding behind vague “fair use” claims.
• Tech giants’ data-scraping mirrors surveillance capitalism, monetizing user data and creative output with zero transparency or compensation.
• Proposed laws like the Generative AI Copyright Disclosure Act aim to force dataset transparency, but enforcement burden still falls on creators.
• Without collective action or strong legal protection, artists risk being erased from their own industries by Silicon Valley’s profit-driven AI agenda.
Hayao Miyazaki had no kind words for generative AI art when he first witnessed an early model in Koganei, Japan in 2016, years before generative image and video AI would become commercially accessible to consumers. Studio Ghibli had prepared a demonstration to show the founder AI’s potential in the future of animation work. If that meeting had gone well, AI-generated art might be part of Ghibli films today.
Personally, if I were going to try to impress the visionary with Ghibli-applicable AI art, I wouldn’t have started with a rendering of a grotesque zombie crawling unnaturally across the floor. That would’ve spooked anyone seeing it for the first time. Miyazaki’s reaction was understandable:
“Whoever creates this stuff has no idea what pain is. I am utterly disgusted. If you truly want to make creepy stuff like this, you can go ahead. But I will never wish to incorporate this technology into my work. I strongly feel that this is an insult to life itself.”
You get two guesses whether Studio Ghibli adopted the technology.
I.
Fast forward to today, and OpenAI has recently launched a viral ChatGPT feature called Ghiblification, where users can submit photos to be reimagined in Miyazaki’s style. It is a tasteless disrespect to one of the greatest art icons in human history. OpenAI is either tone-deaf or directly hateful toward the legacy of Hayao Miyazaki. In either case, they’re deserving of getting checked by society.
Generative AI art companies have been sourcing copyrighted work without artists’ permission since their inception, a direct violation of intellectual property rights that mirrors timeless colonial values. OpenAI doesn’t even try to wear a mask on the subject. Sam Altman, their CEO, has openly admitted:
“it would be impossible to train today’s leading AI models without using copyrighted materials.”
Which reads like a confession that copyrighted work sits in ChatGPT’s training datasets. These AI companies have pleaded fair use, likening themselves to Google’s data-scraping practices. Except Google scrapes data from websites to help connect users to the very sites it scrapes, whereas generative AI art companies scrape copyrighted work with the intention of directly competing against those artists using their own work.
II.
Ghiblification is the latest example of data exploitation. Silicon Valley is guilty of a litany of data-surveillance issues that have yet to hit legislative desks. Google, Meta, Twitter, and Amazon have a history of extorting data from users in ways you wouldn’t expect. For example, Google doesn’t merely scan its own users’ email, but any email that makes contact with a Gmail account. These are the same names that have been scraping user data to sell to intelligence agencies and private companies for psychological profiling and behavior modification.
This corporate cyber extortion is something Shoshana Zuboff would label surveillance capitalism. In her book, The Age of Surveillance Capitalism, she explains that people live in a data panopticon that corporations are able to scrape at will. What purpose that data will be used for is none of the business of the people who created it. They have no right to what they created with their own hands.

Big tech corporations profit from user data as a business model. That’s why we get social media platforms for free: the posts we make, the products and services we buy, and the conversations we have with each other are all scraped to help companies make more money from that data. It’s so effective they don’t need to charge us.
What’s happening to AI artists is a consequence of another tentacle of the octopus experts are calling surveillance capitalism: an economic system where users and creators own nothing. Do so-called AI artists have IP rights to their art generations? No; legally, work created by non-human entities doesn’t qualify for intellectual property rights. The only party in the situation getting any rights at all is the tech companies. Coincidentally, they’re also getting all the money. The data-extraction business model is predatory and lacks transparency, and that lack of transparency creates a space where artists’ rights aren’t protected.
III.
A necessary component of artists pursuing justice is transparency into AI datasets. AI companies know this, which is why they’ve made their datasets increasingly difficult to identify. In 2018, OpenAI was relatively transparent about its use of BooksCorpus (a dataset of roughly 7,000 self-published books retrieved from smashwords.com, much of it protected under copyright). By 2020, OpenAI had learned to label the training datasets for GPT-3 more vaguely (as “Books1” and “Books2”).
The EU AI Act and the proposed U.S. Generative AI Copyright Disclosure Act attempt to give artists rights in this situation, but they place the burden of monitoring and enforcement on the artist. And if we’ve learned anything from industrialist resource-extraction tactics in Latin America, it’s that when victims lack the means to enforce their rights, those rights will be gladly violated by corporations that want the resources.

Musicians have the benefit of being backed by music labels that will slam copyright strikes on their behalf. For example, Universal Music Group had tracks featuring AI imitations of Drake and The Weeknd pulled from all streaming platforms. Visual artists are going to need a powerful advocate of their own, whether a public democratic agency or a private corporate relationship like the music labels. It may not even be out of the question for them to unionize and fund a dedicated monitoring task force.
That kind of information would be pivotal for Studio Ghibli, which could pursue two avenues for justice over the GPT Ghiblification feature: suing over the input or over the output. They could win a court battle if copyrighted work from Studio Ghibli films were used in the training datasets, or if the output resembles trademarked characters. The former is blocked by OpenAI’s vagueness short of a court-ordered investigation. And even if a case could blow through OpenAI’s veil of secrecy, it would hit a wall if it turned out the input data was fan art resembling the Ghibli style, which is not legally protected.
IV.
Artists need to push for the Generative AI Copyright Disclosure Act. It has been proposed but not passed into law. If passed, the Act would force AI companies to disclose links to their datasets 30 days before releasing a product to consumers, with fines for violations starting at $5,000.
Visual artists should take a cue from music artists, who have superior protections from generative AI. Although the Generative AI Copyright Disclosure Act wouldn’t monitor datasets for violations, visual artists might not have to bear that burden themselves if algorithmic crawlers scanned for unauthorized use of copyrighted work, much like the crawlers on social media platforms that listen for music violations; a rough sketch of the idea follows below. Combining a mechanism like this with the Generative AI Copyright Disclosure Act wouldn’t solve everything, but it would be a great start in protecting artists’ rights.
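To make the idea concrete, here is a minimal sketch of how such a crawler might flag near-copies of registered works. It assumes a small local registry of an artist’s images and uses perceptual hashing via the Pillow and ImageHash Python libraries; the file names and the match threshold are placeholders, not an existing system.

```python
# Illustrative sketch only: flag scraped images that closely resemble
# an artist's registered works using perceptual hashes.
from PIL import Image          # pip install Pillow
import imagehash               # pip install ImageHash

# Hypothetical registry: paths to the artist's copyrighted works.
REGISTERED_WORKS = ["registered_work_01.png", "registered_work_02.png"]
MATCH_THRESHOLD = 8  # max Hamming distance to count as a suspected match

def build_registry(paths):
    """Precompute a perceptual hash for each registered work."""
    return {p: imagehash.phash(Image.open(p)) for p in paths}

def scan_image(candidate_path, registry):
    """Return the registered works a candidate image closely resembles."""
    candidate_hash = imagehash.phash(Image.open(candidate_path))
    return [
        work
        for work, registered_hash in registry.items()
        # Subtracting two ImageHash objects yields their Hamming distance.
        if candidate_hash - registered_hash <= MATCH_THRESHOLD
    ]

if __name__ == "__main__":
    registry = build_registry(REGISTERED_WORKS)
    # "scraped_post.png" stands in for an image pulled from a public feed.
    matches = scan_image("scraped_post.png", registry)
    if matches:
        print("Suspected unauthorized use of:", matches)
```

A scanner like this only catches near-duplicates of specific images; it cannot detect style mimicry like Ghiblification, which is exactly why dataset disclosure still matters.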
While the controversy surrounding copyright in generative AI is complicated by elderly legislators with low internet literacy and antagonistic non-artist netizens who see this obstacle to artists as a necessary step in progress, one thing remains true: Silicon Valley is applying a social and economic grip on society strong enough to break down democratic self-determination. Artists should be able to opt out of these data-scraping practices if they don’t want their intellectual property automatically sourced by generative AI. And if they’re going to be incentivized into opting in, then AI companies should be paying them for it.
Complete neglect of these artists, like the direct disrespect to Hayao Miyazaki, is another example of an unleashed profits-over-people philosophy from Silicon Valley technocrats. Without formal power structures defending the rights of individuals, the tech bros demonstrate they have no issue walking over people’s rights in the name of what they call “progress.” But true progress orients all things to the service of people, not the disoriented evil of people at the service of things.
“I strongly feel that this is an insult to life itself.”
