News

Ethical Sourcing of African Language Data: Lanafrica and the NOODL licence

17 August, 2025

Over 2,000 African languages are spoken by approximately 1.4 billion people on the continent, showing how linguistic diversity underpins African democracy, development, and cultural life. As artificial intelligence becomes central to progress in areas like healthcare, agriculture and education, new methods for collecting and sharing African language data are urgently...

LATAM-GPT: A Culturally Sensitive Large Language Model for Latin America

28 June, 2025

LATAM-GPT is a groundbreaking large language model developed by the National Center for Artificial Intelligence (CENIA) in Chile, in partnership with over thirty institutions and twelve Latin American countries. The initiative aims to create an open-source AI model that reflects the region’s diverse cultures, languages—including Spanish, Portuguese, and Indigenous tongues—and...

The Global Evolution of AI Fact-Checking: Copyright and Research Gaps

Abstract: The propagation and evolution of AI-powered fact-checking tools worldwide have foregrounded the issue of access to quality training data. While many of the most widely adopted systems have originated in the Global North, there are also notable and growing efforts in the Global South to develop and adapt fact-checking...

Blind South Africa: Apps for the Visually Impaired

An initiative led by Christo de Klerk at Blind South Africa focuses on promoting the use of accessible mobile and digital applications for blind and visually impaired people in South Africa. Highlighted at the 2025 Copyright and the Public Interest in Africa conference, the project addresses the importance of robust...

A Talking Health Chatbot in African Languages: DSFSI, University of Pretoria

A project at the Data Science for Social Impact (DSFSI) group, University of Pretoria, led by Professor Vukosi Marivate, is developing a talking health chatbot in African languages to provide accessible, culturally relevant health information to underserved communities. Central to the initiative is the planned use of health actuality TV...

Masakhane: Use of the JW300 Dataset for Natural Language Processing

The Masakhane Project showcases the transformative power of open, collaborative efforts in advancing natural language processing (NLP) for African languages. However, its reliance on the JW300 dataset—a vast multilingual corpus primarily comprising copyrighted biblical translations—uncovered significant legal and ethical challenges. These challenges focused on copyright restrictions, contract overrides, and the...

Promoting AI for Good in the Global South – Highlights

Across Africa and Latin America, researchers are using Artificial Intelligence to solve pressing problems: from addressing health challenges and increasing access to information for underserved communities, to preserving languages and culture. This wave of “AI for Good” in the Global South faces a major difficulty: how to access good quality...
