
Most large language models are trained on English-dominant data, leaving well over a hundred million Hausa and Yoruba speakers underserved. At Keuro Lab, we set out to change that with OribAI.
What We Built
OribAI-14B is an instruction-tuned language model built on Qwen2.5-14B-Instruct and fine-tuned with LoRA adapters (rank r=32, alpha=64) using Unsloth and TRL. Our dataset comprises 27,498 unique Hausa and Yoruba conversational pairs, carefully curated to preserve linguistic integrity and cultural nuance.
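To make the LoRA settings above concrete, here is a minimal sketch of what r=32 and alpha=64 imply. It assumes the standard LoRA scaling of alpha / r (not the rank-stabilized variant), and the 5120 x 5120 matrix is a hypothetical example shape chosen for illustration, not a statement about which layers we adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSpec:
    """LoRA hyperparameters as stated in the post."""
    r: int = 32      # adapter rank
    alpha: int = 64  # scaling numerator

    @property
    def scaling(self) -> float:
        # Assumption: standard LoRA, where the update B @ A is scaled by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


spec = LoraSpec()

# Hypothetical example: one square 5120 x 5120 projection matrix.
d_in = d_out = 5120
full = d_in * d_out
lora = spec.adapter_params(d_in, d_out)

print(f"scaling factor: {spec.scaling}")
print(f"adapter params: {lora:,} vs full matrix: {full:,}")
print(f"fraction trained: {lora / full:.2%}")
```

For a matrix of that shape, the adapter trains roughly 1% of the weights a full fine-tune would touch, which is why this approach fits on modest training hardware.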
Why It Matters
Hausa is spoken by over 80 million people, and Yoruba by over 50 million. Both communities deserve AI systems that understand them, not systems that translate from English first and lose meaning in the process. Building in local languages from the ground up is not just a technical choice; it is a commitment to technological sovereignty.
Local Inference
We also released a 4-bit GGUF export (Q4_K_M) that runs on consumer hardware with approximately 10 GB of RAM via llama.cpp or Ollama, with no cloud service required. This matters enormously in contexts where internet access is unreliable or expensive.
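As a rough illustration of fully local use, the sketch below sends a prompt to an Ollama server running on the same machine, using only the Python standard library. It assumes Ollama is listening on its default port (11434) and that the GGUF file has already been imported under a model tag; the tag `oribai-14b` is a hypothetical name, so substitute whatever you chose.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "oribai-14b"  # hypothetical tag; use the name you gave the imported GGUF


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout: float = 300.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `generate("Sannu!")` returns the model's reply as a string. Nothing leaves the machine, since the request goes to localhost only.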
OribAI is open-sourced on Hugging Face. We invite researchers, developers, and linguists to build on it.

