Blog
From the Bhala Team
Language tech, product launches, engineering notes.
A 15M Zulu-only model beats GPT-4o on Swahili — and understands Korean without ever seeing it
A 15M-parameter encoder pretrained on isiZulu and nothing else reaches 73.2% on Swahili intent classification (above GPT-4o zero-shot at 70.6%) and 72.5% on Korean, using only a linear probe on the frozen encoder. Korean shares neither a language family nor a script with Zulu, and the model never saw it in pretraining. By the strictest version of the field's gold-standard test (frozen encoder + linear probe + zero target-language data), this is the strongest published cross-lingual transfer result we know of.
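For readers new to the protocol, here is a minimal sketch of what "frozen encoder + linear probe + zero target-language data" means in practice. It is not the Bhala model or evaluation code: the feature vectors are synthetic stand-ins for encoder outputs, and `make_features`, `train_linear_probe`, and the `shift` parameter are hypothetical names invented for this illustration. The only trainable part is a single linear layer, fitted on source-language features and scored on target-language features it never saw.

```python
import math
import random


def make_features(rng, n, dim=8, shift=0.0):
    """Synthetic stand-in for frozen encoder outputs (assumption, not real data).

    Each class is one Gaussian cluster; `shift` mimics a small offset
    between source-language and target-language representations.
    """
    data = []
    for i in range(n):
        label = i % 2
        center = 1.5 if label == 1 else -1.5
        vec = [rng.gauss(center + shift, 1.0) for _ in range(dim)]
        data.append((vec, label))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(data, epochs=50, lr=0.1):
    """Fit logistic regression by SGD. The encoder features are never updated."""
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - y  # gradient of the log loss w.r.t. the logit
            for j in range(dim):
                w[j] -= lr * grad * x[j]
            b -= lr * grad
    return w, b


def accuracy(w, b, data):
    correct = 0
    for x, y in data:
        predicted = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(predicted == y)
    return correct / len(data)


rng = random.Random(0)
source = make_features(rng, 200)             # labelled source-language features
target = make_features(rng, 200, shift=0.3)  # target language: evaluation only

w, b = train_linear_probe(source)
target_acc = accuracy(w, b, target)
print(f"zero-shot target accuracy: {target_acc:.3f}")
```

Because the probe is linear and the encoder is frozen, any accuracy on the target set has to come from structure already present in the representations, which is why this setup is treated as a strict test of transfer.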
Silver labels are noisy by design. Bhala's audit catches the worst of them — top-10 precision: 100%.
Every production NLP team is sitting on silver-labeled training data: auto-tagged at scale and noisy by design. Bhala's audit tool surfaces the real mislabels in those corpora using just 100 hand-curated seeds and zero sentiment supervision. Top-10 precision on held-out validation: 100%. AUROC: 0.732. Seeds curated in one Bantu language also transfer to a second Bantu language, surfacing clear errors and policy-boundary cases there with no extra supervision. The product play: a 5–10× multiplier on reviewer time across the AI lifecycle.
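The two metrics quoted above answer different questions, and a small worked example makes the difference concrete. The sketch below is illustrative only: the scores and labels are made-up toy values, not Bhala data, and `precision_at_k` and `auroc` are plain reference implementations written for this example. Precision at k asks how many of the k highest-scored items are real mislabels; AUROC asks how often a randomly chosen mislabel outscores a randomly chosen clean example across the whole ranking.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored items that are true mislabels (label 1)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of examples,
    which is fine for a small illustration.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy audit output (assumed values): higher score = more suspicious label.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]  # 1 = confirmed mislabel, 0 = label was fine

top2 = precision_at_k(scores, labels, k=2)
top3 = precision_at_k(scores, labels, k=3)
area = auroc(scores, labels)
print(top2, top3, area)
```

In this toy ranking the top two items are both real mislabels even though the overall ranking is imperfect, which is how a perfect top-k precision can sit alongside a more modest AUROC: a reviewer working from the top of the list sees only the strongest hits.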
RLHF makes LLM bias invisible. Here's what it actually looks like underneath.
Most fairness audits read a model's outputs and stop there. We probed inside six open LLMs (LLaMA-2 7B base and chat, Mistral 7B base and instruct, Phi-2 2.7B, and InkubaLM 0.4B) at the encoder level, the actual structure that drives every downstream task. The model that looks cleanest at the output (InkubaLM 0.4B, biased on 7 of 28 axes by pseudo-log-likelihood, PLL) and the surface-clean RLHF-tuned models all show the same bias structure once you go past the output gate. RLHF hides bias from the audit; it doesn't remove it from the encoder. If your product wraps any of these models for retrieval, search, or classification, the encoder is what you ship, and that's where the bias still is.
We audited Gemma 4. The bias didn't go away — it went into hiding.
Standard fairness audits call Gemma 4 clean. We ran a stronger one and found bias intact on all 28 protected dimensions we tested. Here's what it means for your deployment, how to audit any open model the same way, and a live API call you can paste into a terminal right now to flip a biased sentence.