Learning from the best, paying none: the Copyright Dilemma in the age of Generative AI.

“Intellectual property shall be protected”, states the EU Charter of Fundamental Rights. However, new and powerful Artificial Intelligence-based tools are driving major changes in the creative industry. Are these words, then, destined to remain just a motto?

Tuesday, 11 November 2025

“Hello Chat, could you transform this picture into a Studio Ghibli-style image?”

We all know what the output looked like: soft, ethereal, pastel-coloured scenes flooding our social media feeds, strikingly reminiscent of the works of the famous Japanese artist Hayao Miyazaki. Like many online trends, it faded away within a couple of months, yet it lasted long enough to raise very serious questions about copyright in the AI era.

Has Altman’s company violated the law? Was that an act of copyright infringement? Was the AI model trained on Miyazaki’s copyrighted work without consent, credit or compensation?[1]

These questions apply more broadly to all Large Language Models (LLMs) and generative AI systems, which are known to train on large datasets of text, images and other (supposedly) public material available on the internet or provided by users.

Just before the Ghibli-style portraits went viral, another scandal broke. In January 2025, a group of authors filed a copyright infringement lawsuit against Meta in the US[2]. The plaintiffs accused Zuckerberg’s company of scraping the LibGen database – a well-known repository of pirated books – to train its AI model[3]. No credit, of course.

Now, our goal is not to weigh in on the evidence in these lawsuits, but to consider what they reveal about a wider trend: every leap in AI sophistication brings with it new and deeper challenges for law and ethics.

The knot of copyright and data mining.

General-purpose AI (GPAI) models rely on deep learning techniques that require three things: large datasets of both structured and unstructured data; data scraping, that is, the automated extraction of training data from the web and other online sources; and data mining, the automated techniques used to analyse that data[4].

To better understand the legal implications, let’s now turn to EU Copyright Law and the unprecedented challenges coming from GPAI and its hunger for data.

The Directive on Copyright in the Digital Single Market (CDSM Directive, Directive (EU) 2019/790)[5] aims to harmonise copyright and related rights within the internal market and introduces exceptions for text and data mining (TDM).

Article 3 covers TDM for scientific research purposes[6]. Article 4 permits the reproduction and extraction of works for general data mining by any user, including commercial entities, provided two conditions are met[7]: the works must be lawfully accessible, and rightsholders must not have expressly reserved their rights. A major point of contention, however, is the profound ambiguity surrounding the term "lawful access".

Under Article 4(3) of the Directive, rightsholders can opt out of TDM by embedding a machine-readable notice in their online content. However, the absence of a harmonised technical standard for this opt-out mechanism creates significant ambiguity regarding which works are legally excluded. This practical shortcoming risks undermining rightsholders’ control rather than ensuring it. Furthermore, the Directive imposes no obligation on AI developers to retrain models or delete previously ingested material[8], with the result that LLM training datasets frequently include copyrighted works[9].
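To make the idea of a machine-readable reservation more concrete, here is a minimal sketch of how a crawler could check one such signal before collecting a page. It uses robots.txt, which the Code of Practice mentions among the measures for identifying rights reservations, and Python's standard urllib.robotparser module. The crawler name ExampleAIBot, the file contents and the URLs are hypothetical. The Directive does not designate robots.txt as the required format, and the sketch does not describe how any particular AI developer actually operates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a rightsholder might publish on their site.
# It blocks one (made-up) AI training crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/gallery/artwork-1.png"
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        verdict = "allowed" if may_crawl(ROBOTS_TXT, agent, page) else "reserved"
        print(f"{agent}: {verdict}")
```

A check like this only works if the rightsholder knows the crawler's name and the crawler chooses to honour the file, which is one reason the lack of a harmonised standard matters.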

Additional obligations arise from the European AI Act, specifically Article 53(1)(c) and (d). Point (c) requires providers of general-purpose AI models to put in place a policy to comply with EU copyright law, while point (d) requires them to “draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model, according to a template provided by the AI Office”[10]. Although the recently published GPAI Code of Practice aims to guide compliance with this framework, it currently remains non-binding.

In conclusion, despite a broad consensus that AI developers must comply with European copyright law, significant legal uncertainty persists, especially concerning the TDM exceptions. Even where the inclusion of copyrighted material in LLM training datasets appears permissible under certain conditions, the act of training itself may still constitute unauthorised reproduction, thus exposing developers to potential copyright liability[11].

Intellectual property as a fundamental right.

The boundary between style and substance represents another tight knot.

In an article on the Ghibli affair[12], the European Innovation Council and Small and Medium-sized Enterprises Executive Agency (EISMEA) asked whether AI companies should obtain licences or compensate creators whose works were used as training material. While style as such is not protected by intellectual property law, the individual works that embody it most certainly are.

The ability of AI technology to replicate artists’ work poses several challenges for rightsholders.

According to a European Commission study on copyright and new technologies[13], a large number of surveyed rightsholders (47%) agree that AI-generated copies might affect them more than human-made imitations, due to the extreme precision and speed of generative tools. Moreover, 77% of them strongly agree that this might drive some artists out of business.

The hidden risk is that the entire creative sector may be exploited. Writers, designers and illustrators, actors and actresses, filmmakers, and rightsholders of any kind – especially small creators and freelancers – face the threat of being displaced by hyper-fast and widely accessible generative tools that replicate their work without consent, credit or compensation. Artificial intelligence will do well enough, and eventually even better, because it has learned from the best. Productivity increases at zero cost.

Artists deprived of their style cannot rely on copyright law either, since style falls outside the scope of protection. In a legal framework built upon economic compensation, can moral rights be exercised to oppose the use of one’s work to train AI models?

In one of the EU policy scenarios[14], 67% of participants agree that rightsholders should be able to oppose the processing of their work for AI training. However, extending copyright law to cover style is not seen as the best solution (41%)[15]. Instead, many participants prefer to rely on unfair commercial practices claims when an artist’s style is mimicked or copied by AI.

The difficulty of attributing a style to an individual author, the lack of harmonisation at EU level, as well as the misapplication of Art. 4 CDSM, have effectively enabled large-scale AI training based on copyrighted works without authorisation or remuneration, creating a regulatory gap that rewards AI companies while eroding creators’ rights and bargaining power.

“Intellectual property shall be protected”[16]. Are these words destined to remain just a motto?

Takeaway.

Several months after the Studio Ghibli episode, the General-Purpose AI Code of Practice introduced new commitments to address these risks. By signing the Code, signatories commit “to implement appropriate and proportionate technical safeguards to prevent their models from generating outputs that reproduce training content protected by Union law on copyright and related rights in an infringing manner”[17].

Nonetheless, the Code remains non-binding, leaving compliance largely at companies’ own discretion – and those companies derive great value from the massive ingestion of copyrighted work to train AI models.

When AI models are trained on human-created content, no authorisation or compensation is ensured to the creator, who can rely only on the complex and ambiguous opt-out mechanism. On the output side, AI-generated content often replicates or mimics original works, combining elements from protected materials. Even in this case, no safeguards are guaranteed to rightsholders.

To successfully organise a collective licensing infrastructure, future policy should be shaped around three guiding principles, as proposed by the European Parliament’s Policy Department for Justice, Civil Liberties and Institutional Affairs: transparency, to clarify how copyrighted works are used; fairness, to ensure that rights and revenues are shared with the rightsholders; and enforceability, to clearly designate the EU body that sets and oversees the rules.

Filling the current regulatory gap implies switching from an opt-out model to an opt-in mechanism, which would exclude protected works from AI training unless prior authorisation is granted. This would restore authors’ control over their works by default.

To rebalance the power asymmetry between human creators and large AI developers, and to ensure compliance with the principle of appropriate and proportionate remuneration (Art. 18 of the CDSM Directive)[18], equitable remuneration should be mandatory whenever an author’s work is used without prior authorisation.

Finally, technical measures and safeguards should be implemented to make creative works traceable and reduce the risk of copyright infringement.

By Priscilla Colaci

[1] OECD (2025), “Intellectual property issues in artificial intelligence trained on scraped data”, OECD Artificial Intelligence Papers, No. 33, OECD Publishing, Paris, https://doi.org/10.1787/d5241a23-en

[2] Creamer, E. (2025, April 3). ‘Meta has stolen books’: authors to protest in London against AI trained using ‘shadow library.’ The Guardian. https://www.theguardian.com/books/2025/apr/03/meta-has-stolen-books-authors-to-protest-in-london-against-ai-trained-using-shadow-library

[3] Team, S. P. (2025, April 8). The SoA’s message to Meta: don’t steal our books. The Society of Authors. https://societyofauthors.org/2025/04/04/the-soas-message-to-meta-dont-steal-our-books/; Team, S. P. (2025a, April 4). UK authors stage protest at Meta HQ against Zuckerberg’s #bookthieves. The Society of Authors. https://societyofauthors.org/2025/04/03/uk-authors-stage-protest-at-meta-hq-against-zuckerbergs-bookthieves/

[4] OECD (2025), “Intellectual property issues in artificial intelligence trained on scraped data”, OECD Artificial Intelligence Papers, No. 33, OECD Publishing, Paris, https://doi.org/10.1787/d5241a23-en

[5] Directive - 2019/790 - EN - dsm - EUR-Lex. (n.d.). https://eur-lex.europa.eu/eli/dir/2019/790/oj

[6] Ibidem.

[7] This aspect has been formally addressed within the AI Code of Practice (Copyright), (n.d.). Shaping Europe’s Digital Future. https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai, where it is specified that “Signatories commit to take appropriate measures to enable affected rightsholders to obtain information about the web crawlers employed, their robots.txt features and other measures that a Signatory adopts to identify and comply with rights reservations expressed pursuant to Article 4(3) of Directive (EU) 2019/790 at the time of crawling by making public such information and by providing a means for affected rightsholders to be automatically notified when such information is updated (such as by syndicating a web feed) without prejudice to the right of information provided for in Article 8 of Directive 2004/48/EC”. Nonetheless, the Code is non-binding and its application is not guaranteed.

[8] Policy Department for Justice, Civil Liberties and Institutional Affairs Directorate-General for Citizens’ Rights, Justice and Institutional Affairs PE 774.095. (2025). Generative AI and copyright. Training, creation, regulation. In https://www.europarl.europa.eu/thinktank/en/home.

[9] Gervais, Daniel J. and Shemtov, Noam and Marmanis, Haralambos and Zaller Rowland, Catherine, The Heart of the Matter: Copyright, AI Training, and LLMs (September 21, 2024). Available at SSRN: https://ssrn.com/abstract=4963711 or http://dx.doi.org/10.2139/ssrn.4963711  

[10] Article 53: Obligations for providers of General-Purpose AI models | EU Artificial Intelligence Act. (n.d.). https://artificialintelligenceact.eu/article/53/

[11] Gervais, Daniel J. and Shemtov, Noam and Marmanis, Haralambos and Zaller Rowland, Catherine, The Heart of the Matter: Copyright, AI Training, and LLMs (September 21, 2024). Available at SSRN: https://ssrn.com/abstract=4963711

[12] Studio Ghibli vs AI: tribute or copyright infringement? (2025, April 15). IP Helpdesk. https://intellectual-property-helpdesk.ec.europa.eu/news-events/news/studio-ghibli-vs-ai-tribute-or-copyright-infringement-2025-04-15_en.

[13] Study on copyright and new technologies – Copyright data management and artificial intelligence, Publications Office of the European Union, 2022, p. 151, https://data.europa.eu/doi/10.2759/570559.

[14] Ibidem, p. 227. The four scenarios investigated are: “Status quo. The moral rights are regulated at Member States’ level. […] Legal clarification that moral rights can be invoked against the processing of the work or performance for AI training. […] Legal clarification that moral rights cannot be invoked against the processing of the work for AI training if TDM exception is allowed. […] Legal clarification that moral rights cannot be invoked against the processing of the work for AI training, if the work or performance is not recognisable in the output”.

[15] Ibidem, p. 248.

[16] The protection of intellectual property, as explicitly recognised in Art. 17 of the EU Charter of Fundamental Rights, should be ensured through the application of the guarantees laid down in paragraph (1) of the same article, which states: “Everyone has the right to own, use, dispose of and bequeath his or her lawfully acquired possessions. No one may be deprived of his or her possessions, except in the public interest and in the cases and under the conditions provided for by law, subject to fair compensation being paid in good time for their loss”. Article 17 - Right to property. (2025, July 03). European Union Agency for Fundamental Rights. https://fra.europa.eu/en/eu-charter/article/17-right-property

[17] The General-Purpose AI Code of practice. (n.d.). Shaping Europe’s Digital Future. https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai

[18] Art. 18 - Principle of appropriate and proportionate remuneration (17 April 2019). DIRECTIVE (EU) 2019/790 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32019L0790. According to Art. 18(1), “Member States shall ensure that where authors and performers license or transfer their exclusive rights for the exploitation of their works or other subject matter, they are entitled to receive appropriate and proportionate remuneration”.