
Abstract
The propagation and evolution of AI-powered fact-checking tools worldwide has foregrounded the issue of access to quality training data. While many of the most widely adopted systems have originated in the Global North, there are also notable and growing efforts in the Global South to develop and adapt fact-checking technologies. As AI tools develop in new regions, they raise important questions about the differential impact of factors such as copyright on training data access, pointing to persistent obstacles and areas needing further research.
1. Introduction
The spread of AI-powered fact-checking tools marks a significant shift in how misinformation is detected and addressed, transforming practices in journalism and civil society across continents. Systems that began in research institutions and newsrooms in the Global North are now being adapted and implemented in local contexts—including Africa, Latin America, and the Middle East. Although some widely used technologies originated in the North, emerging initiatives in the Global South are beginning to shape the landscape through locally designed and customized tools.
This evolution brings renewed attention to longstanding questions: Which factors shape obstacles or biases within training data? Do factors like resource constraints or copyright laws shape who can access, adapt, and benefit from these AI fact-checking systems? As adoption extends to newer regions, the risk of unequal impact increases, not only due to linguistic and technical barriers, but also because of differences in copyright policies and regulatory environments.
When it comes to copyright, there is a need for further research, especially as global partnerships and technology transfer intensify.
2. The Evolution and Impact of North-South AI Fact-Checking Partnerships
Recent years have seen the emergence of global collaborations for AI-powered fact-checking that bridge expertise between the Global North and South. One example began with the development and piloting of AI technologies by Full Fact, a UK-based fact-checking organization, and was then extended through partnerships with organizations such as Africa Check (Africa), Chequeado (Latin America), and the Arab Fact-Checking Network (Middle East). While most early implementation focused on English-language media monitoring and elections in developed countries, philanthropic support and local adaptations have greatly expanded reach and relevance.
Scale and Spread:
- AI technologies, first developed in the UK, have since been adapted for and integrated with regional fact-checking teams, through grant funding, collaborative research, and multilingual upgrades.
- These partnership tools now support over 40 organizations in 30+ countries and 3 languages, with active work underway to extend coverage to many other languages, including major African languages.
3. Concrete Examples of AI Fact-Checking through Partnerships
Nigeria (Africa Check)
During Nigeria’s 2023 elections, Africa Check deployed AI-powered claim detection and transcription systems (developed in partnership with Full Fact) to monitor over 40,000 daily “fact-checkable claims” from more than 80 media sources. These claims were algorithmically screened for verification potential; a subset were selected for intensive human review and public debunking. This enabled fact-checkers to respond at unprecedented scale to viral misinformation and political rumors, with documented improvements in election monitoring and rapid rebuttal.
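The screen-then-review workflow described above can be illustrated with a minimal sketch. It does not reproduce the models used by Full Fact or Africa Check, which are not described here; it assumes a toy heuristic in which numbers and quantitative cue words make a sentence more "checkable". The names `Claim`, `CUE_WORDS`, `checkability_score`, and `select_for_review`, as well as the sample sentences, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    source: str  # e.g. the media outlet or channel the sentence came from


# Hypothetical cue words; a production system would use a trained classifier.
CUE_WORDS = {"percent", "million", "billion", "increased", "decreased", "doubled"}


def checkability_score(claim: Claim) -> float:
    """Toy heuristic: digits and quantitative cue words suggest a checkable claim."""
    words = re.findall(r"[a-z]+", claim.text.lower())
    has_number = bool(re.search(r"\d", claim.text))
    cue_hits = sum(1 for w in words if w in CUE_WORDS)
    return (1.0 if has_number else 0.0) + 0.5 * cue_hits


def select_for_review(claims: list[Claim], k: int) -> list[Claim]:
    """Return the k highest-scoring claims for human fact-checkers."""
    ranked = sorted(claims, key=checkability_score, reverse=True)
    return ranked[:k]


# Invented sample input, for illustration only.
claims = [
    Claim("Unemployment increased to 33 percent last year.", "radio"),
    Claim("We love this country.", "tv"),
    Claim("The state built 200 schools.", "newspaper"),
]
shortlist = select_for_review(claims, k=2)
```

In this sketch the algorithm only ranks candidates; the decision to investigate and publish a rebuttal stays with human reviewers, as in the deployment described above.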
Argentina (Chequeado)
In Latin America, Chequeado integrated the partnership’s core AI technologies—adapting them into their “Chequeabot” system for Spanish-language fact-checking. Used during major televised debates and news events, Chequeabot supplied journalists with real-time claim detection and prioritization, flagging misleading or controversial statements for timely investigation and reporting.
Middle East (Arab Fact-Checking Network)
The Arab Fact-Checking Network and affiliated organizations have recently adopted the same AI tools for live tracking and countering misinformation in Arabic-language media. During national elections and crises, these technologies enabled rapid, high-volume monitoring of broadcast and online claims, supporting collaborative efforts to uphold media integrity and provide accurate information under pressure.
4. The Impact of Copyright Restrictions on Training Data Quality
A critical challenge for AI-powered fact-checking is that legal and financial constraints, including copyright restrictions, can limit access to high-quality, rigorously verified journalism. Leading news sources may be behind paywalls or subject to licensing requirements, making them difficult or costly to include in AI training sets. By contrast, low-quality, sensationalized content is often more readily available and is promoted by online algorithms.
This discrepancy between access to low-quality and verified information sources partly explains the increased need for AI fact-checking. But it also presents the danger that automated fact-checking tools may be skewed through their training data toward unreliable sources. Without intentional curation and targeted copyright exceptions and/or ethical licensing, automated systems risk amplifying lower-quality information or missing key local knowledge.
Projects must prioritize source verification, data provenance transparency, and partnerships with credible media to preserve integrity—while more research is needed on how copyright affects access and outcomes in different regions.
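One way a project might put source verification and provenance transparency into practice is to record, for every training document, whether its source has been vetted and on what legal basis it is used. The sketch below is illustrative only: the `Document` fields, the license labels ("licensed", "exception", "unknown"), and the publisher names are assumptions made for the example, not an established schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str     # publisher name (fictional in this example)
    verified: bool  # hypothetical flag: source vetted by the project
    license: str    # hypothetical labels: "licensed", "exception", "unknown"


def provenance_report(docs):
    """Count documents per (verified, license) pair for a transparency report."""
    return Counter((d.verified, d.license) for d in docs)


def curate(docs):
    """Keep documents from vetted sources whose legal basis is recorded."""
    return [d for d in docs if d.verified and d.license != "unknown"]


# Invented corpus, for illustration only.
corpus = [
    Document("Daily Wire Service", True, "licensed"),
    Document("Viral Blog", False, "unknown"),
    Document("Regional Gazette", True, "unknown"),
    Document("Public Broadcaster", True, "exception"),
]
report = provenance_report(corpus)
kept = curate(corpus)
```

Publishing a report of this kind alongside a model would let outside researchers see how much of the training set rests on licenses, on copyright exceptions, or on material of unknown status.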
5. Obstacles Facing AI Fact-Checking Tools in the Global South
Where large-scale AI fact-checking tools are initially optimized for English and major European languages, this can lead to gaps in coverage and effectiveness for smaller languages and markets. Both imported and locally developed systems must contend with:
- Sparse high-quality training data:
Some under-resourced or smaller languages lack sufficient volumes of credible, fact-checked journalism and digital resources, making it challenging to train robust AI models with broad coverage.
- Difficulties in adapting AI models to idioms and local nuance:
Language models often struggle to interpret regional idioms, cultural references, or socio-political context, which can result in inaccurate or incomplete claim detection and verification.
- Limited commercial incentives for development:
Smaller language markets and lower-resource regions often attract less investment for AI research and product development, which can further delay the creation and adoption of locally relevant fact-checking technologies.
- Restrictive copyright regimes and unclear legal exceptions:
Varying national laws can limit access to training data and the permissible use of copyrighted material, creating legal uncertainties that slow the development and deployment of effective AI fact-checking tools.
6. The Differential Impact of Copyright on AI Fact-Checking Tools
Access to high-quality training data is essential for developing effective AI fact-checking tools. However, obtaining this data typically requires either formal licenses from content owners or reliance on specific copyright exceptions, allowing for legally sanctioned use. Copyright exceptions vary widely across jurisdictions.
- United States:
The US relies on the doctrine of “fair use,” which provides flexible protection for uses such as research, news reporting, and teaching. Courts have interpreted fair use to permit the use of copyrighted materials in transformative ways (including, in some cases, for training AI and text/data mining), making it easier for organizations to access and process journalistic content.
- European Union:
The EU has established clear exceptions for “text and data mining” (TDM): Article 3 of the EU’s Digital Single Market (DSM) Directive grants research organizations the right to mine data for non-commercial purposes, while Article 4 permits broader TDM unless expressly reserved by rights holders. These exceptions support AI development, though some differences exist in adoption and implementation across member states.
- Africa and Latin America:
Many countries in Africa and Latin America have more restrictive copyright regimes, with few explicit exceptions for research or AI use. Fact-checkers and developers may face legal uncertainty and limited access to news and other content for training and verification, hampering the development and reach of AI-powered tools.
When it comes to generative AI more broadly, recent court decisions illustrate both the possibilities and the limits of relying on copyright exceptions to access training data. Notably, the US case of Thomson Reuters v. Ross Intelligence found that commercial use of copyrighted legal content for AI training may not qualify as fair use, whereas Kadrey v. Meta Platforms acknowledged that “highly transformative” uses may be protected. In Europe, the Hamburg Regional Court and Hungarian Municipal Court have upheld TDM exceptions for AI research under EU law in non-commercial contexts, setting important precedents for lawful data mining and access.
7. Research Agenda: Questions for Future Study
This brief analysis surfaces several areas urgently in need of deeper research and empirical investigation as AI-powered fact-checking spreads globally:
- To what extent do differences in copyright law, especially restrictive national laws or uneven regional regimes, affect the ability of fact-checking organizations to develop and access large, high-quality training datasets?
- Are licensing arrangements between news organizations and fact-checkers a key part of the solution to data access challenges, and what models are most effective in supporting equitable AI development? Are such arrangements available in all regions?
- To what extent have the US fair use and EU text and data mining exceptions assisted in the development and deployment of AI fact-checking tools?
- What risks arise from training AI models primarily on more easily available but lower-quality web content, if access to rigorous journalism is constrained by copyright or cost?
- In what ways might the growth of local AI development in the Global South offer distinctive solutions to these challenges, and what legal reforms would best support sustainable, inclusive innovation?
8. Conclusion
The propagation of AI-powered fact-checking across borders—exemplified by the evolution and international spread of major partnership models—demonstrates both promise and persistent complexity. Language coverage, training data disparities, and legal constraints play significant roles in shaping who benefits from new technologies and whose voices are heard. The differential impact of copyright law is a newly urgent research challenge, likely affecting access, effectiveness, and equity in media verification globally. Ongoing empirical study and cross-regional collaboration are needed to understand and address these issues as the AI fact-checking ecosystem continues to grow.
Key References
- Full Fact. “How AI helps us detect 100,000 potential claims a day.” (2021):
https://fullfact.org/blog/2021/apr/ai-google-100000–claims-day/
- Africa Check. “How we’re using artificial intelligence to scale up global fact-checking.” (2024):
https://africacheck.org/fact-checks/blog/how-were-using-artificial-intelligence-scale-global-fact-checking
- JournalismAI. “CheckMate: AI for Fact-Checking Video Claims.” (2025):
https://www.journalismai.info/blog/ai-for-factchecking-video-claims
- Google Africa Blog. “Nigerian fact checkers fight election misinformation with Full Fact’s AI tools.” (2023):
https://blog.google/intl/en-africa/company-news/technology/nigerian-fact-checkers-fight-election-misinformation-with-full-facts-ai-tools/
- Poynter. “Fact-checkers use automation to maximize their impact.” (2021):
https://www.poynter.org/fact-checking/2021/fact-checkers-use-automation-to-maximize-their-impact/
- Chequeado. “Chequeabot: Investing in technology for preventing the spread of misinformation.” (2024):
https://solve.mit.edu/challenges/2024-global-economic-prosperity-challenge/solutions/87157
- Arab Fact-Checking Network. [2024 project blogs and reports]:
https://arabfcn.net/en/category/blog/
- Reuters Institute. “Generative AI is already helping fact-checkers. But it’s proving less useful in small languages and outside the West.” (2024):
https://journalismai.com/2024/04/29/generative-ai-is-already-helping-fact-checkers-but-its-proving-less-useful-in-small-languages-and-outside-the-west-reuters-institute/
- Ropes & Gray. “A Tale of Three Cases: How Fair Use Is Playing Out in AI Copyright Lawsuits.” (2025):
https://www.ropesgray.com/en/insights/alerts/2025/07/a-tale-of-three-cases-how-fair-use-is-playing-out-in-ai-copyright-lawsuits