Public AI Launch, and Some Thoughts on Copyright

I attended the exciting launch of a series of papers and reflections on “Public AI” at the EU Parliament this week. The core idea is that the world outside the US and China needs more publicly directed, open-source AI resources, from computational capacity to open data sets (like the EU’s “data spaces”), to build both commercial and non-commercial AI tools delinked from big tech.

There is an important copyright issue at its core. To build AI infrastructure, including support for the development of frontier and foundation models that may themselves be non-profit but can serve as the base for other (including commercial) developers, Public AI model builders need legal certainty as to what material they can use for training. If they don’t have the same rights as Chinese and US developers, they won’t be able to succeed.

Some developers are working with only openly licensed and public domain sources, but their models then tend to be trained on much smaller data sets. Cultural heritage organizations want to help, but they also need certainty as to whether they can curate and share data with model builders. Article 3 of the EU CDSM (2019) provides some cover, but publishers claim that it does not extend to training AI and covers only traditional academic pursuits. Most developing countries lack even an Art. 3 type leg to stand on.

In this context, the future of Public AI appears to depend heavily on how the right to research is defined within modern copyright laws. Proposals to apply remuneration requirements, if any, only after a specific application (“output”) of a foundation model proves to have copyright-relevant effects (e.g. commercial substitution) may be one path forward. See Senftleben, Martin, Generative AI and Author Remuneration (June 14, 2023). International Review of Intellectual Property and Competition Law 54 (2023), pp. 1535-1560.
