Artificial Intelligence

Artificial Intelligence

Public AI Launch, and Some Thoughts on Copyright

I attended the exciting launch of a series of papers and reflections on “Public AI” at the EU Parliament this week. The core idea is that the world outside the US and China needs more publicly directed and open-source AI resources, from computational capacity to open data sets (like the EU’s “data spaces”), in order to build both commercial and non-commercial AI tools delinked from big tech. There is an important copyright issue at its core. To build AI infrastructure, including frontier and foundation models that may themselves be non-profit but can serve as the base for other (including commercial) developers, Public AI model builders need legal certainty about what material they can use for training. If they do not have the same rights as Chinese and US developers, they will not be able to succeed. Some developers work only with openly licensed and public domain sources, but their models tend to be trained on much smaller data sets. Cultural heritage organizations want to help, but they too need certainty about whether they can curate and share data with model builders. Article 3 of the EU CDSM Directive (2019) provides some cover, but publishers claim it covers only traditional academic pursuits, not AI training. Most developing countries lack even an Article 3-type leg to stand on. In this context, the future of Public AI appears to depend heavily on how modern copyright laws define the right to research. Proposals to apply remuneration requirements, if any, only after a specific application (“output”) of a foundation model proves to have copyright-relevant effects (e.g., commercial substitution) may be one path forward. See Martin Senftleben, Generative AI and Author Remuneration (June 14, 2023), International Review of Intellectual Property and Competition Law 54 (2023), pp. 1535-1560.

Artificial Intelligence, Blog, Centre News

Centre Announces Short Course on Intellectual Property and Artificial Intelligence

The Centre on Knowledge Governance is pleased to announce a new short course on AI and IP, to take place in Geneva from September 7-8, 2026.

COURSE DESCRIPTION

This intensive two-day course provides a comprehensive, comparative analysis of the evolving legal and policy landscape at the intersection of Intellectual Property (IP) and Artificial Intelligence (AI). Participants will explore pressing legal challenges, including copyright protection for AI training data, the patentability and copyright of AI-generated outputs, and the balance between proprietary interests and the public interest in research (Text and Data Mining and computational research) and the development of “Public AI.”

The course will feature in-depth comparative analysis of legal frameworks and policy proposals across the European Union (EU), the United States (USA), India, Brazil, Singapore and Japan, and in international forums such as the World Intellectual Property Organization, the World Trade Organization and other agencies.

The learning experience will culminate in a practical role-play exercise in which students will draft a model international legal instrument aimed at ensuring fair remuneration for creators while safeguarding the rights of researchers and public interest organizations developing AI infrastructure. This legal instrument will focus on a range of factors to be used in distinguishing research and public interest uses of AI from commercial competitive uses.

LEARNING OBJECTIVES

Upon completion of this course, participants will be able to:

WHO IS THIS PROGRAMME FOR?

This programme is particularly relevant for mid- to senior-level practitioners from various organisations working at the intersection of intellectual property and AI policy or scholarship, such as:

LECTURERS

The course will be directed by Sean Flynn and Ben Cashdan of the Centre on Knowledge Governance, Geneva Graduate Institute. Guest lecturers will participate in person or online to bring comparative expertise from jurisdictions such as India, Brazil, China and the African continent, in addition to the US and EU.

SCHOLARSHIPS

10 scholarships will be available for highly motivated government delegates from developing countries or representatives of public interest organizations who participate in multilateral policy processes on copyright, AI and the rights of researchers.

EXPRESSION OF INTEREST (INITIAL APPLICATION)

If you are interested in being considered as a student on the course, and/or if you would like to apply for one of our scholarships, please complete the following form:

Africa: Copyright & Public Interest, Artificial Intelligence, TDM Cases

Case Studies of AI for Good and AI for Development

Today the Geneva Centre on Knowledge Governance presents a series of Case Studies on AI for Good in Africa and the Global South. These grew out of our work on Text and Data Mining and our policy work in support of the Right to Research. Researchers in the Global South are responding to local and global challenges, from health and education to language preservation and climate change mitigation. In all these cases, computational methods and Artificial Intelligence (AI) play a leading role in finding and implementing solutions. A common thread that runs through all the cases is how intellectual property laws can support innovation and problem solving in the public interest, whilst protecting the interests of creators, communities and custodians of traditional knowledge. In addition, several practitioners are looking at how to redress data imbalances, where large companies in the Global North have much greater access to works for historical, legal and economic reasons. The cases include: Each of our case studies is written up in the form of a report, combined with a video exploration of the case study in the words of its leading practitioners.
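For readers unfamiliar with what text and data mining involves in practice, the technique can be illustrated with a minimal sketch: a researcher computes aggregate statistics (here, simple term counts) over a corpus without redistributing the works themselves. The corpus and search terms below are hypothetical, and the code uses only the Python standard library.

```python
from collections import Counter
import re

def mine_term_frequencies(corpus, terms):
    """Count how often each term of interest appears across a corpus.

    Only aggregate statistics leave this function; the underlying
    works are read but never reproduced or redistributed.
    """
    counts = Counter()
    for document in corpus:
        for token in re.findall(r"[a-z']+", document.lower()):
            if token in terms:
                counts[token] += 1
    return dict(counts)

# Hypothetical mini-corpus standing in for a digitized collection.
corpus = [
    "Malaria incidence fell after the new intervention.",
    "The intervention reduced malaria transmission in the region.",
]
print(mine_term_frequencies(corpus, {"malaria", "intervention"}))
# {'malaria': 2, 'intervention': 2}
```

Real TDM pipelines operate at far larger scale, but the copyright question is the same: the inputs are protected works, while the outputs are non-expressive facts about them.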

Artificial Intelligence, Blog, Latin America / GRULAC

AI, Copyright, and the Future of Creativity: Notes from the Panama International Book Fair

By Andrés Izquierdo. During the second week of August, I was invited to speak at the Panama International Book Fair, an event hosted by the World Intellectual Property Organization (WIPO), the Panama Copyright Office, the Ministry of Culture, and the Panama Publishers Association. My presentation focused on the increasingly complex intersection between copyright law and artificial intelligence (AI), a topic now at the center of global legal, cultural, and economic debate. This post summarizes the core arguments of that presentation, drawing on recent litigation, academic research, and policy developments, including the U.S. Copyright Office’s May 2025 report on generative AI. How should copyright law respond to the widespread use of protected works in the training of generative AI systems? The analysis suggests there are emerging discussions around several key areas: the limits of fair use and exceptions, the need for enforceable remuneration rights, and the role of licensing and regulatory oversight. The article proceeds in five parts: it begins with an overview of the legal and technological context surrounding AI training; it then reviews academic proposals for recalibrating copyright frameworks; it examines recent court decisions that test the boundaries of current doctrine; it summarizes the U.S. Copyright Office’s 2025 report as an institutional response; and it concludes by outlining four policy considerations for future regulation.

A Shifting Legal and Technological Landscape

The integration of generative AI into creative and informational ecosystems has exposed foundational tensions in copyright law. Current systems routinely ingest large volumes of copyrighted works, such as books, music, images, and journalism, to train AI models. This practice has given rise to unresolved legal questions: Can copyright law meaningfully regulate the use of training data?
Do existing doctrines and legal provisions, such as fair use or exceptions and limitations, extend to these practices? What remedies, if any, are available to rightsholders whose works are used without consent? These questions remain open across jurisdictions. While some courts and regulatory agencies have begun to respond, a substantial part of the debate is now being shaped by legal scholarship and litigation, each proposing frameworks to reconcile AI development with copyright’s normative commitments. The following sections examine this evolving landscape, beginning with recent academic proposals.

Academic Perspectives: Towards a New Equilibrium

In reviewing the literature, several clear themes have emerged. First, some authors agree that remuneration rights for authors must be strengthened. Geiger, Scalzini, and Bossi argue that, to truly ensure fair compensation for creators in the digital age, especially in light of generative AI, EU copyright law must move beyond weak contractual protections and instead implement strong, unwaivable remuneration rights that guarantee direct and equitable revenue flows to authors and performers as a matter of fundamental rights. Second, some scholars highlight that the technical opacity of generative AI demands new approaches to author remuneration. Cooper argues that as AI systems evolve, it will become nearly impossible to determine whether a work was AI-generated or whether a particular copyrighted work was used in training. He warns that this loss of traceability renders attribution-based compensation models unworkable. Instead, he calls for alternative frameworks to ensure creators are fairly compensated in an age of algorithmic authorship. Third, scholars like Pasquale and Sun argue that policymakers should adopt a dual system of consent and compensation: giving creators the right to opt out of AI training and establishing a levy on AI providers to ensure fair payment to those whose works are used without a license.
Gervais, meanwhile, argues that creators should be granted a new, assignable right of remuneration for the commercial use of generative AI systems trained on their copyrighted works, complementing, but not replacing, existing rights related to reproduction and adaptation. There is also a growing consensus on the need to modernize limitations and exceptions, particularly for education and research. Flynn et al. show that a majority of the countries in the world do not have exceptions that enable modern research and teaching, such as academic uses of online teaching platforms. And in Science, several authors propose harmonizing international and domestic copyright exceptions to explicitly authorize text and data mining (TDM) for research, enabling lawful, cross-border access to copyrighted materials without requiring prior licensing. At WIPO, the Standing Committee on Copyright and Related Rights (SCCR) has been taking steps in this area by approving a work program on limitations and exceptions, currently under discussion for the upcoming SCCR 47. And in the Committee on Development and Intellectual Property (CDIP), a Pilot Project has been approved on TDM to Support Research and Innovation in Universities and Other Research-Oriented Institutions in Africa – Proposal by the African Group (CDIP/30/9 REV). My own work, as well as that of Díaz & Martínez, has emphasized the urgency of updating Latin American educational exceptions to account for digital and cross-border uses. Eleonora Rosati argues that unlicensed AI training falls outside existing EU and UK copyright exceptions, including Article 3 of the DSM Directive (TDM for scientific research), Article 4 (general TDM with opt-outs), and Article 5(3)(a) of the InfoSoc Directive (use for teaching or scientific research). She finds that exceptions for research, education, or fair use-style defenses do not apply to the full scope of AI training activities.
As a result, she concludes that a licensing framework is legally necessary and ultimately unavoidable, even when training is carried out for non-commercial or educational purposes. Finally, policy experts like James Love warn that “one-size-fits-all” regulation risks sidelining the medical and research breakthroughs promised by artificial intelligence. The danger lies in treating all training data as equivalent: conflating pop songs with protein sequences, or movie scripts with clinical trial data. Legislation that imposes blanket consent or licensing obligations, without distinguishing between commercial entertainment and publicly funded scientific knowledge, risks chilling socially valuable uses of AI. Intellectual property law for AI must be smartly differentiated, not simplistically uniform.

Litigation as a Site of Doctrinal Testing

U.S. courts have become a key venue for testing the boundaries of copyright in the age of artificial intelligence. In the past two years, a growing number of cases

Artificial Intelligence, Blog

A first look into the JURI draft report on copyright and AI

This post was originally published on COMMUNIA by Teresa Nobre and Leander Nielbock. Last week we saw the first draft of the long-anticipated own-initiative report on copyright and generative artificial intelligence authored by Axel Voss for the JURI Committee (download as a PDF file). The report, which marks the third entry in the Committee’s recent push on the topic, after a workshop and the release of a study in June, fits in with the ongoing discussions around copyright and AI at the EU level. In his draft, MEP Voss targets the legal uncertainty and perceived unfairness around the use of protected works and other subject matter for the training of generative AI systems, strongly encouraging the Commission to address the issue as soon as possible instead of waiting for the looming review of the Copyright Directive in 2026.

A good starting point for creators

The draft report starts by calling on the Commission to assess whether the existing EU copyright framework addresses the competitive effects associated with the use of protected works for AI training, particularly the effects of AI-generated outputs that mimic human creativity. The rapporteur recommends that such an assessment shall consider fair remuneration mechanisms (paragraph 2) and that, in the meantime, the Commission shall “immediately impose a remuneration obligation on providers of general-purpose AI models and systems in respect of the novel use of content protected by copyright” (paragraph 4). Such an obligation shall be in effect “until the reforms envisaged in this report are enacted.” However, we fail to understand how such a transitory measure could be introduced without a reform of its own. Voss’s thoughts on fair remuneration also require further elaboration, but the rapporteur is clearly concerned solely with remunerating individual creators and other rightholders (paragraph 2).
Considering, however, the vast amounts of public resources being appropriated by AI companies for the development of AI systems, remuneration mechanisms need to channel value back to the entire information ecosystem. Expanding this recommendation beyond the narrow category of rightholders therefore seems crucial. Paragraph 10 deals with the much-debated issue of transparency, calling for “full, actionable transparency and source documentation by providers and deployers of general-purpose AI models and systems”, while paragraph 11 asks for an “irrebuttable presumption of use” where the full transparency obligations have not been fully complied with. Recitals O to Q clarify that full transparency shall consist “in an itemised list identifying each copyright-protected content used for training”, an approach that does not seem proportionate, realistic or practical. At this stage, a more useful approach to copyright transparency would be to go beyond the disclosure of training data, which is already dealt with in the AI Act, and recommend the introduction of public disclosure commitments on opt-out compliance. A presumption of use, which is a reasonable demand, could still kick in based on a different set of indicators. Another set of recommendations aimed at addressing the grievances of creators is found in paragraphs 6 and 9 and includes the standardization of opt-outs and the creation of a centralized register for opt-outs. These measures are very much in line with COMMUNIA’s efforts to uphold the current legal framework for AI training, which relies on creators being able to exercise and enforce their opt-out rights.
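To make the opt-out discussion concrete: the machine-readable reservations that crawlers are expected to honor today are typically expressed in robots.txt-style directives, which is one reason standardization and a central register are on the table. The sketch below uses Python’s standard-library robots.txt parser with hypothetical crawler names; it illustrates the general mechanism, not any specific register proposed in the report.

```python
from urllib import robotparser

# Hypothetical robots.txt a rightholder might publish to reserve
# their works from an AI training crawler while allowing other bots.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: ExampleAIBot",   # hypothetical AI training crawler
    "Disallow: /",                # opt out of all crawling by that bot
    "",
    "User-agent: *",
    "Allow: /",
])

print(rp.can_fetch("ExampleAIBot", "https://example.org/article"))  # False
print(rp.can_fetch("SomeOtherBot", "https://example.org/article"))  # True
```

The weakness the draft report tries to address is visible even in this sketch: compliance depends entirely on the crawler choosing to check and obey the reservation, which is why enforceable standardization and disclosure of opt-out compliance matter.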
Two points of concern for users

At the same time that it tries to uphold the current legal framework, the draft report also calls either for the introduction of a new “dedicated exception to the exclusive rights to reproduction and extraction” or for expanding the scope of Article 4 of the DSM Directive “to explicitly encompass the training of GenAI” (paragraph 7). At first glance, this recommendation may appear innocuous, redundant even, given that the AI Act already assumes that such legal provision extends to AI model providers. However, the draft report does not simply intend to clarify the current EU legal framework. On the contrary, the report claims that the training of generative AI systems is “currently not covered” by the existing TDM exceptions. This challenges the interpretation provided for in the AI Act and in multiple statements by the Commission, and opens the door for discussions around the legality of current training practices, with all the consequences this entails, including for scientific research. The second point of concern for users is paragraph 13, which calls for measures to counter copyright infringement “through the production of GenAI outputs.” Throughout the stakeholder consultations on the EU AI Code of Practice, COMMUNIA was very vocal about the risks this category of measures could entail for private uses, protected speech and other fundamental freedoms. We strongly opposed the introduction of system-level measures to block output similarity, since those would effectively require the use of output filters without safeguarding users’ rights. We also highlighted that model-level measures targeting copyright-related overfitting could have the effect of preventing the lawful development of models supporting substantial legitimate uses of protected works.
As this report evolves, it is crucial to keep this in mind and to ensure that any copyright compliance measures targeting AI outputs are accompanied by relevant safeguards that protect the rights of users of AI systems.

A win for the Public Domain

One of the last recommendations in the draft report concerns the legal status of AI-generated outputs. Paragraph 12 suggests that “AI-generated content should remain ineligible for copyright protection, and that the public domain status of such works be clearly determined.” While some AI-assisted expressions can qualify as copyright-protected works under EU law, most importantly when there is sufficient human control over the output, many will not meet the standards for copyright protection. However, these outputs can still potentially be protected by related rights, since most related rights have no originality threshold for protection. This calls into question whether the related rights system is fit for purpose in the age of AI: protecting non-original AI outputs with exclusive rights, regardless of any underlying creative activity and in the absence of meaningful investment, is certainly inadequate. We therefore support the recommendation that their public domain status be asserted in those cases.

Next steps

Once the draft report is officially published and presented in JURI on

Artificial Intelligence, Blog

Danish Bill Proposes Using Copyright Law to Combat Deepfakes

Recently, a Danish bill has been making headlines by addressing issues related to deepfakes through a rather uncommon approach: copyright. Speaking to The Guardian, the Danish Minister of Culture, Jakob Engel-Schmidt, explained that they “are sending an unequivocal message that everybody has the right to their own body, their own voice and their own facial features, which is apparently not how the current law is protecting people against generative AI.” According to CNN, the minister believes that the “proposed law would help protect artists, public figures, and ordinary people from digital identity theft.” Items 8, 10, and 19 of the proposal include some of the most substantive changes to the law. Among other measures, Item 8 proposes adding a new § 65(a), requiring the prior consent of performers and performing artists before digitally generated imitations of them can be made available to the public, and establishing protection for a term of 50 years after their death. Item 10 introduces a new § 73(a), focusing on “realistic digitally generated imitations of a natural person’s personal, physical characteristics,” requiring prior consent from the person being imitated before such imitations can be made available to the public. This exclusive right would also last for 50 years after the death of the imitated person and would not apply to uses such as caricature, satire, parody, pastiche, criticism, or similar purposes. It could be argued that this approach is uncommon because several countries, including those in the European Union, already have laws regulating personality rights and, more specifically, personal data. Copyright is known for regulating the use of creative expressions of the human mind, not the image, voice, or likeness of a person considered individually, i.e., outside the context of an artistic performance.
According to CNN, “Engel-Schmidt says he has secured cross-party support for the bill, and he believes it will be passed this fall.” A machine-translated version of the Proposal is below: Notes:

Artificial Intelligence, Blog

Latest Developments on Training GenAI with Copyrighted Works and Some 'What Ifs?'

‘Boring’ is not a word that can be used to describe the past few days for those interested in litigation involving copyright issues in the development and use of Generative AI systems. Two major cases saw significant updates, with orders addressing one of the main questions raised in these lawsuits: is the use of copyrighted materials to train Generative AI systems fair use? This blog post aims to briefly describe each case’s key points related to fair use and to highlight what was left unresolved, including all the ‘what if’ scenarios that were hinted at but not decided upon.

Bartz, Graeber & Johnson v. Anthropic

Judge William Alsup’s order on fair use addressed not only the different copies of copyrighted material made for training generative AI systems but also uses related to Anthropic’s practice of keeping copies as a “permanent, general-purpose resource”. It also distinguished between legally purchased copies and millions of pirated copies retained by Anthropic, applying a different fair use analysis to each category. In the overall analysis of fair use for copyrighted works used to train Anthropic’s Generative AI system, Judge Alsup found that the use “was exceedingly transformative and was a fair use.” Among the four factors, only the second factor weighed against using copyrighted works to train the GenAI system. As for the digitization of legally purchased books, it was also considered fair use, not because of the purpose of training AI systems, but for a much simpler reason: “because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies”. For this specific use, of the four factors, only factor two weighed against fair use, while factor four remained neutral.
On the other hand, Judge Alsup clearly stated that using pirated copies to create the “general-purpose library” was not fair use, even if some copies might be used to train LLMs. All factors weighed against it. Specifically, Judge Alsup noted: “it denies summary judgment for Anthropic that the pirated library copies must be treated as training copies. We will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages, actual or statutory (including for willfulness).”

Kadrey v. Meta

At the very beginning of the order, Judge Vince Chhabria clarified that the case asks whether using copyrighted material to train generative AI models without permission or remuneration is illegal, and affirmed that “although the devil is in the details, in most cases the answer will likely be yes. What copyright law cares about, above all else, is preserving the incentive for human beings to create artistic and scientific works. Therefore, it is generally illegal to copy protected works without permission. And the doctrine of ‘fair use,’ which provides a defense to certain claims of copyright infringement, typically doesn’t apply to copying that will significantly diminish the ability of copyright holders to make money from their works (thus significantly diminishing the incentive to create in the future).” Judge Chhabria further explained that “by training generative AI models with copyrighted works, companies are creating something that often will dramatically undermine the market for those works, and thus dramatically undermine the incentive for human beings to create things the old-fashioned way.” According to him, this would primarily affect not classic works or renowned authors but rather the market for the “typical human-created romance or spy novel,” which could be substantially diminished by similar AI-created works.
However, all these points were framed as “this Court’s general understanding of generative AI models and their capabilities”, with Judge Chhabria emphasizing that “Courts can’t decide cases based on general understandings. They must decide cases based on the evidence presented by the parties.”

Despite this general understanding that “copying the protected works, however transformative, involves the creation of a product with the ability to severely harm the market for the works being copied, and thus severely undermine the incentive for human beings to create”, Judge Chhabria found two of the plaintiffs’ three market harm theories “clear losers,” and the third, a “potentially winning” argument, underdeveloped:

“First, the plaintiff might claim that the model will regurgitate their works (or outputs that are substantially similar), thereby allowing users to access those works or substitutes for them for free via the model. Second, the plaintiff might point to the market for licensing their works for AI training and contend that unauthorized copying for training harms that market (or precludes the development of that market). Third, the plaintiff might argue that, even if the model can’t regurgitate their own works or generate substantially similar ones, it can generate works that are similar enough (in subject matter or genre) that they will compete with the originals and thereby indirectly substitute for them. In this case, the first two arguments fail. The third argument is far more promising, but the plaintiffs’ presentation is so weak that it does not move the needle, or even raise a dispute of fact sufficient to defeat summary judgment.”

In the overall analysis of the four factors, only the second factor weighed against Meta. Summary judgment was granted to Meta regarding the claim of copyright infringement from using plaintiffs’ books for AI training.
Nevertheless, Judge Chhabria clarified that “this ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful. It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one.”

The use of pirated copies was also addressed in Kadrey v. Meta. In this case, “there is no dispute that Meta torrented LibGen and Anna’s Archive […].” According to Judge Chhabria, while downloading from shadow libraries wouldn’t automatically win the plaintiffs’ case, it was relevant to the fair use analysis, especially regarding “bad faith” and whether the downloads benefited or perpetuated unlawful activities.

Lessons

Artificial Intelligence, Blog, Case Studies, TDM Cases

Promoting AI for Good in the Global South – Highlights

Across Africa and Latin America, researchers are using Artificial Intelligence to solve pressing problems: from addressing health challenges and increasing access to information for underserved communities, to preserving languages and culture. This wave of “AI for Good” in the Global South faces a major difficulty: how to access good-quality training data, which is scarce in the region and often subject to copyright restrictions. The most prominent AI companies are in the Global North and, increasingly, in China. These companies generally operate in jurisdictions with more permissive copyright exceptions, which enable Text and Data Mining (TDM), often the first step in training AI language models. The scale of data extraction and exploitation by a handful of AI mega-corporations has raised two pressing concerns: what about researchers and developers in the Global South, and what about the creators and communities whose data is being used to train the AI models?

Ethical AI: An Opportunity for the Global South?

At a side event at WIPO in Geneva in April 2025, we showcased some models of ‘ethical AI’. This week we released a 15-minute highlights video.

Training data and copyright issues

At the start of the event, we cited two Text and Data Mining projects in Africa which have had difficulty in accessing training data due to copyright. The first was the Masakhane Project in Kenya, which used translations of the Bible to develop Natural Language Processing tools in African languages. The second was the Data Science for Social Impact group at the University of Pretoria in South Africa, which wants to develop a health chatbot using broadcast TV shows as the training data.
Data Farming, The NOODL license, Copyright Reform

The following speakers then presented cutting-edge work on how to solve copyright and other legal and ethical challenges facing public interest AI in Africa.

The AI Act in Brazil: Remunerating Creators

Carolina Miranda of the Ministry of Culture in Brazil indicated that her government is focused on passing a new law to ensure that creators in Brazil whose work is used to train AI models are properly remunerated. Ms Miranda described how Big Tech in the Global North fails to properly pay creators in Brazil and elsewhere for the exploitation of their work. She confirmed that discussions of the AI Act are still ongoing and that non-profit scientific research will be exempt from the remuneration provision. Jamie Love of Knowledge Ecology International suggested that, to avoid the tendency of data providers to build a moat around their datasets, a useful model is the Common European Data Spaces being established by the European Commission.

Four Factors to Evaluate AI for Good

At the end of the event we put forward four discriminating factors which might be used to evaluate to what extent copyright exceptions and limitations should allow developers and researchers to use training data in their applications.

The panel was convened by the Via Libre Foundation in Argentina and ReCreate South Africa, with support from the Program on Information Justice and Intellectual Property (PIJIP) at American University and from the Arcadia Fund. We are currently researching case studies on Text and Data Mining (TDM) and AI for Good in Africa and the Global South.

Ben Cashdan is an economist and TV producer in Johannesburg and the Executive Director of the Black Stripe Foundation. He also co-founded ReCreate South Africa.

Artificial Intelligence, Blog

The Great Flip: Can Opt-Outs be a Permitted Exception? Part II

By Lokesh Vyas and Yogesh Badwal. This post was originally published on Spicy IP.

In the previous part, we examined whether the opt-out mechanism, as claimed in Gen-AI litigations, constitutes a prohibited formality for the “enjoyment and exercise” of authors’ rights under Article 5(2) of the Berne Convention. And we argued no. In this post, we address the second question: can opting out be permitted as an exception under the three-step test outlined in Article 9(2)? If you haven’t seen the previous post, some context is helpful. (Or you can skip this part.) As we mentioned in the last post, “Many generative AI models are trained on vast datasets (which can also be copyrighted works) scraped from the internet, often without the explicit consent of content creators, raising legal, ethical, and normative questions.” To address this, some AI developers have created and claimed “opt-out mechanisms,” allowing copyright holders or creators to ask that their works not be used in training (e.g., OpenAI’s Policy FAQs).

Opt out under the Copyright Exception

A question arises here: in what other ways can opt-out mechanisms be justified, if states want to create such a mechanism? One may say that opt-outs can be valid under the Berne Convention if an exception (e.g., an AI training exception with an inbuilt opt-out possibility) passes the three-step test. And this way, opt-outs can be regarded as a legitimate limit on holders’ exclusive rights. For reference, the three-step test was created at the 1967 revision conference and later followed in Article 13 of TRIPS and Article 10 of the WCT. The test creates room for nations to make certain exceptions and limitations. Article 9(2) authorises the member countries “to permit the reproduction” of copyright works in (1) “certain special cases, provided that such reproduction” (2) “does not conflict with a normal exploitation of the work” and (3) “does not unreasonably prejudice the legitimate interests of the author”.
Although we don’t delve into the test here, how opting out can be part of an exception can be understood from an example. As Ginsburg illustrates, if a country states that authors lose their translation rights unless they explicitly reserve or opt out of them, it would violate Article 5(2), because such rights under Berne must apply automatically, without formalities. This actually happened to Turkey in 1931, whose application for membership was rejected due to the condition of deposit for translation rights in its domestic law (see Ricketson and Ginsburg’s commentary, paragraph 17.18). But if an exception (like allowing radio retransmissions in bars) already complies with Berne’s provisions and applies equally to all authors, then letting authors opt out of that exception would give them more rights than Berne requires. And this should be permissible.

Notably, introducing an exception, such as one for AI training, must first pass the three-step test; an opt-out can be built into it. However, remember that every exception presupposes a prima facie infringement. Within that frame, the opt-out offers the author a chance not to lose her rights, and it thus creates an inadvertent expansion of those rights beyond the convention.

Additionally, an opt-out can fare well under the three-step test due to the factor of “equitable remuneration to authors.” As van Gompel notes in his piece, “…‘opt out’ eases compliance with the three-step test because it mitigates some of the adverse effects of the proposed copyright exception. That is, it enables authors to retain exclusivity by opting out of the compensation scheme.”

Another question also exists: did Berne contain particular provisions that directly allowed an opt-out arrangement? Well, the answer is yes.

Does opting out equal the right to reserve under Article 10bis?

Not really.
Setting aside the debate over formalities and the three-step test, the Berne Convention does contain an opt-out-style provision, albeit a limited one, under which authors must explicitly reserve their rights to avoid specific uses of their work. Relevant here is Article 10bis of the Convention, which allows member countries to create exceptions for the reproduction of works published in newspapers on, among other topics, current economic, political, or religious issues. However, it also allows authors to ‘expressly reserve’ their work from reproduction. The Indian Copyright Act, 1957 contains a similar provision in Section 52(1)(m). Interestingly, the right to reserve exploitation has been part of the Berne Convention since its earliest draft. It first appeared in Article 7, alongside the provision on formalities, which was numbered Article 2 in the draft. Article 7 became Article 9(2) in 1908, when formalities were prohibited and the no-formality rule entered the Berne Convention.

This historical pairing raises a strong presumption: opting out of a specific mode of exploitation cannot automatically be deemed a prohibited formality. Ginsburg confirms this, citing the 1908 Berlin Conference, which clarified that the reservation/opt-out clause (then Article 9(2)) was not considered a formality. But can this special setting (created in Article 10bis(1)) be used to open the door to general opt-out AI exception measures by countries? We doubt it. As the negotiation history of the 1967 revision conference suggests, Article 10bis(1) is a lex specialis, i.e., a narrow and specific exception (see page 1134 of Negotiations, Vol. II). This means that it may derogate from the general no-formalities rule, but it cannot serve as a model for broader declaratory measures.

Conclusion

The upshot is that opt-outs may be de facto formalities. However, not all formalities are prohibited under the Berne Convention.
The convention enables countries to impose some formalities on “the extent of protection.” Three key points emerge from this discussion. One, opting out may not be a formality that prevents the enjoyment and exercise of rights, as van Gompel and Senftleben confirm, though Ginsburg argues otherwise. Two, it can be part of an AI training exception if such an exception passes the three-step test; when applying this test, opting out would support the factor of equitable remuneration. Three, Article 10bis on the right to reserve cannot be read expansively. While it can inform the three-step test analysis, as Senftleben shows, it should not be extended generally. Okay. That’s it from our end. À bientôt!
