Techies decry barriers 'holding back' Rwandan-context AI
Monday, February 16, 2026

The global race towards the artificial intelligence revolution is accelerating. But one resource remains at the heart of it all: data. From chatbots to robotics, no AI system can function without large volumes of quality, structured information.

In Rwanda, innovators say reluctance and slow processes in data sharing are quietly holding back the development of homegrown AI solutions.


For Philbert Murwanashyaka, co-founder of Yali Labs, this challenge is both personal and costly. His team has been working to develop AI systems tailored to the Rwandan context, beginning with what he described as the country’s first Kinyarwanda AI tokenizer.

The ambition grew into building a Kinyarwanda Large Language Model (LLM) capable of human-level conversation. But progress slowed.

“We were spending a lot of money training the model with less data, and at some point, we could fail,” Murwanashyaka said.

“Because of the lack of quality and enough data, the model could predict things that did not make sense.”

According to Murwanashyaka, access to relevant Kinyarwanda datasets, particularly from public institutions and language preservation bodies, proved difficult. While initial discussions were promising, the process often stalled midway.

“There was a willingness to provide it, but we never got it. It became a problem when it came to making the data accessible,” he said, noting that some institutions advised them to rely on publicly available online materials instead.

Yet for AI developers, scattered online content is rarely enough. Language models require structured, clean, and legally accessible datasets, especially when copyrighted books and archival materials are involved.

“Some institutions have access to books and linguistic resources we cannot get because of copyright restrictions. We just want the process of accessing data to be easier,” he added.

The financial cost has been significant. Murwanashyaka said the team has so far spent close to half a million US dollars training its model.

“We count it as a loss because we never got the accurate model we wanted,” he explained.

“Every training cycle came with a cost. We don’t have our own GPUs here, so we had to partner with others, which increased expenses.”


Graphics Processing Units (GPUs) are powerful computer chips that help train artificial intelligence systems much faster. They can handle many calculations simultaneously, which makes them ideal for building complex AI models.

As a result of the setbacks, Murwanashyaka said the team stepped back to focus on strengthening its internal data collection systems while continuing to work on the model.

“Our focus is to reach human-level conversation in Kinyarwanda. That is what is costing us a lot. But we are now improving the data side and investing more to ensure we bring at least a strong base model useful in the public sector,” he said.

The “AI-Ready” data problem

Murwanashyaka’s concerns are echoed across Rwanda’s growing AI community. Audace Niyonkuru, the Chief Executive Officer of Digital Umuganda, noted that the challenge is not only about access but also about format and structure.

“If you are to use public documents to train a language model, you need formats that are not in PDF, which is usually how most public data is stored,” he said.

For AI systems, scanned documents and non-editable PDFs are difficult to process. Developers require machine-readable formats that can be cleaned, labeled, and standardised.

Globally, data is often described as the “new oil.” In the context of AI, it is more accurate to call it the engine. Without it, even the most talented engineers cannot build competitive systems.

What is being done?

In response to these concerns, the Ministry of ICT and Innovation told The New Times that efforts are underway to address the data bottleneck, beginning with the development and launch of the National Data Sharing Policy in 2025. According to the ministry, the policy established the legal framework guiding how data moves across sectors.

To strengthen implementation, the government also launched a dedicated Data Governance Unit within the National Institute of Statistics of Rwanda. The ministry noted that the unit’s role is to standardize data formats for interoperability, ensure compliance with the Data Protection and Privacy Law, and work directly with stakeholders to resolve bottlenecks.


Officials acknowledged that while significant volumes of data exist across government institutions, much of it remains in analog or unstandardised formats.

“As a first step, we are conducting scoping work to understand what Kinyarwanda data already exists, in what formats it is stored, and what standards and processes are required to make it usable,” the ministry stated.

One collaboration highlighted is with the Rwanda Cultural Heritage Academy, which holds extensive linguistic and cultural materials in Kinyarwanda and has digitized part of its collection. Officials see this as a potential foundation for structured language datasets.

In line with the National AI Policy and the 2025 National Data Sharing Policy, the ministry and the Rwanda Information Society Authority (RISA) are working to expand open-source Kinyarwanda datasets in text, audio, and other formats, while ensuring compliance with the Data Protection and Privacy Law.

“The goal is to move from fragmented data to well-curated, AI-ready datasets that developers, researchers, and startups can responsibly build on,” the ministry said.

A centralised platform in the final stages

According to the ministry, a Centralized Data Sharing Platform is in its final stages of development.

“This platform will allow all stakeholders, including private companies and startups, to discover, request, and access high-quality public-sector datasets through a streamlined interface,” the ministry explained.

“Security is our top priority; the platform uses modern encryption and anonymization protocols to ensure compliance with our Data Protection and Privacy laws. We are currently in the phase of onboarding anchor datasets to serve as a pilot before the official launch.”

The collaboration opportunity

The ministry emphasized that collaboration between public institutions, private actors, and academia could unlock greater value for the country’s digital economy.

Such partnerships, the ministry said, would enable startups to build more accurate AI models that solve local problems, while turning Rwanda into what it described as a “living lab” for researchers to test and validate solutions using real-world data.

“Ultimately, this collaboration improves service delivery by ensuring that public resources are allocated based on evidence, reducing waste and making government services more responsive to the needs of every citizen.”