Lisa Udechukwu, a data quality analyst whose work focuses on AI annotation, data integrity, and trustworthy AI evaluation, says the AI systems failing many global users have a data problem, not a technology problem.
Udechukwu, also an executive member of Africa Privacy Roundup, an African-led organization focused on data protection and AI governance, told BusinessDay that when the data used to train AI ignores most of the world, the failures aren’t bugs. They’re built in.
“In 2015, Google Photos automatically tagged photos of two Black people as ‘gorillas.’ The company’s fix, years later, was to remove the categories ‘gorilla,’ ‘chimp,’ and ‘monkey’ from its image classifier entirely, not to fix the underlying data. The problem wasn’t a glitch. It was a symptom.
“I’ve spent years working inside annotation pipelines at companies like Pinterest and Meta — the unglamorous infrastructure layer where human judgment gets converted into training signals for machine learning models.
“And what I’ve seen, consistently, is that the data problem in AI is not a resource problem. It’s a representation problem. And it’s more systematic than most people building these systems want to admit,” she explained.
She stressed that most major AI datasets are heavily concentrated in Western, English-language contexts.
The ImageNet dataset — foundational to a decade of computer vision — drew over 40% of its images from the United States alone. African countries, which account for 17% of the global population, collectively contributed less than 1%.
Language model benchmarks follow the same pattern: English dominates, followed by a handful of European languages, with vast multilingual regions effectively absent. This matters in ways that go far beyond misclassified photos.
She noted that when AI systems trained on unrepresentative data are deployed into global markets, which they routinely are, they carry their blind spots with them: hiring algorithms that misread non-Western names, medical AI that underperforms on skin tones that weren’t in the training set, and search and recommendation systems that misinterpret user intent because cultural context wasn’t part of the label design.
These issues are often treated as edge cases, but many are actually predictable outcomes of incomplete and unrepresentative training data. And yet they rarely appear in model performance reports, because the benchmarks used to evaluate ‘accuracy’ are themselves built on the same skewed data foundations.
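One way to see how a single headline accuracy figure can hide such failures is to break the score down by group. The sketch below is illustrative only: the function name `accuracy_by_group`, the group labels `region_a` and `region_b`, and the toy records are all assumptions invented for this example, not data or tooling from Udechukwu's work.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return (overall_accuracy, {group: accuracy}).

    Each record is a (group, true_label, predicted_label) tuple.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        if truth == pred:
            correct[group] += 1
    overall = sum(correct.values()) / sum(totals.values())
    per_group = {g: correct[g] / totals[g] for g in totals}
    return overall, per_group


# Synthetic toy data (hypothetical): one heavily represented group and
# one sparsely represented group, as in a skewed benchmark.
records = (
    [("region_a", "cat", "cat")] * 17
    + [("region_a", "cat", "dog")] * 1
    + [("region_b", "cat", "cat")] * 1
    + [("region_b", "cat", "dog")] * 1
)

overall, per_group = accuracy_by_group(records)
print(f"overall: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.2f}")
```

In this toy set the aggregate score looks healthy because the well-represented group dominates the average, while the under-represented group fares far worse. A report that publishes only the aggregate would never show that gap.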
“In my work managing annotation pipelines — overseeing quality across thousands of human-labeled data points — I’ve seen how cultural blind spots enter training data not through malice, but through the quiet assumptions embedded in labeling guidelines.
“When an annotation task asks workers to judge whether a search result is ‘relevant,’ the definition of relevance is written by someone. That someone is almost always located in a high-income, English-speaking country.
“The resulting guidelines can work reasonably well for users who look, speak, and search like the guideline-writer. For everyone else, the signal degrades — and that degradation rarely surfaces until a model fails loudly in a market the company cares about.
“Three patterns repeat across nearly every pipeline I’ve worked in: inconsistent labeling when cultural context is ambiguous and guidelines don’t account for it; systematic misclassification in product categories or content types that are common outside the West but were never included in taxonomy design; and relevance scoring that quietly penalizes regional expression, local idiom, and non-standard syntax,” she added.
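The first of those patterns, inconsistent labeling where cultural context is ambiguous, is something a pipeline could in principle measure. A minimal sketch, assuming two annotators label each item and each item carries a locale tag, is to compute Cohen's kappa (a standard chance-corrected agreement statistic) separately per locale. The function names, locale codes, and labels below are hypothetical and are not drawn from any pipeline Udechukwu describes.

```python
from collections import Counter, defaultdict


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' labels over the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and equal in length")
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def kappa_by_locale(items):
    """items: iterable of (locale, annotator_1_label, annotator_2_label)."""
    grouped = defaultdict(lambda: ([], []))
    for locale, a, b in items:
        grouped[locale][0].append(a)
        grouped[locale][1].append(b)
    return {loc: cohen_kappa(a, b) for loc, (a, b) in grouped.items()}


# Synthetic toy data (hypothetical locales and relevance labels).
items = [
    ("en-US", "relevant", "relevant"),
    ("en-US", "relevant", "relevant"),
    ("en-US", "not_relevant", "not_relevant"),
    ("en-US", "not_relevant", "not_relevant"),
    ("yo-NG", "relevant", "relevant"),
    ("yo-NG", "relevant", "not_relevant"),
    ("yo-NG", "not_relevant", "relevant"),
    ("yo-NG", "not_relevant", "not_relevant"),
]

scores = kappa_by_locale(items)
for locale, kappa in sorted(scores.items()):
    print(f"{locale}: kappa = {kappa:.2f}")
```

A locale whose agreement score sits well below the others is a hint that the guidelines may not account for that context. It is not proof, and the annotators themselves may not be at fault.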
Individually, these look like quality issues. Collectively, they are a structural gap in how AI is built.
The Culturally Contextual Datasheet: A Framework for What’s Missing
“In my published research in AI and Ethics (Springer), I introduced the Culturally Contextual Datasheet (CCD), a framework designed to document not just what data a dataset contains, but what cultural assumptions are embedded in how it was collected, labeled, and defined,” she said.
According to her, standard data documentation — model cards, datasheets for datasets — asks questions like: Where was this data collected? How many examples are there? What are the known limitations?
These are necessary. But they don’t ask: Whose definition of ‘correct’ was used in labeling? Whose dialect was treated as standard? Whose concept of relevance, safety, or appropriateness shaped the annotation guidelines?
The CCD framework treats those questions as first-class documentation requirements, because until organizations are required to disclose the cultural assumptions baked into their data, there’s no accountability for the outcomes those assumptions produce.
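To make the idea of "first-class documentation requirements" concrete, the questions above could be recorded as required fields alongside a dataset. The sketch below is an assumption for illustration only: the class name `CulturalContextRecord` and its fields are hypothetical, derived from the questions quoted in this article, and do not reproduce the actual CCD schema from the published paper.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CulturalContextRecord:
    """Hypothetical record of cultural assumptions behind a dataset."""

    dataset_name: str
    # Whose definition of "correct" was used in labeling?
    correctness_defined_by: Optional[str] = None
    # Whose dialect was treated as standard?
    dialect_treated_as_standard: Optional[str] = None
    # Whose concept of relevance, safety, or appropriateness
    # shaped the annotation guidelines?
    guideline_concepts_shaped_by: Optional[str] = None

    def undocumented(self):
        """Return the names of fields that have not been filled in."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Hypothetical usage: a record with only one question answered.
record = CulturalContextRecord(
    dataset_name="demo-search-relevance",
    correctness_defined_by="guideline authors (location not recorded)",
)
print(record.undocumented())
```

A release process could refuse to publish a dataset while `undocumented()` returns anything. That is one possible way of treating the disclosure as a requirement rather than an optional note.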
Udechukwu, whose research on the CCD framework is published in AI and Ethics (Springer), said the people most harmed by culturally incomplete AI data are rarely the people building the systems.
A facial recognition system that misidentifies Black individuals isn’t used by its engineers in their daily lives. A medical AI that underperforms on darker skin tones doesn’t affect the dermatologists who deployed it.
A content moderation system that over-removes posts in Arabic, Yoruba, or Tagalog doesn’t silence the moderators writing the policy.
“As an executive member of Africa Privacy Roundup — an African-led organization focused on data protection and AI governance across the continent — I see this unevenness clearly. African users and communities are increasingly the subjects of AI systems they had no role in training, no voice in designing, and no recourse when those systems fail them.
“That is not just a technical problem. It is a governance problem and an ethical one. The EU AI Act, now in enforcement, requires transparency and risk assessment for high-stakes AI systems. But it was written largely for a European context, with European data realities in mind.
“The Global South needs equivalent frameworks, and the organizations deploying AI in those regions need to be held to equivalent standards, not exempted because the markets are considered ‘emerging.’”
Fixing culturally incomplete AI data doesn’t require tearing systems down. It requires changing what we measure and who we involve. Organizations building AI need to audit their annotation guidelines for cultural assumptions — not just their datasets for demographic balance.
They need to invest in annotator communities that reflect the populations their models serve, not just the populations that are easy to source at scale. And they need to treat cultural documentation as a core deliverable of model development, not an afterthought.
For the companies deploying AI in global markets, the business case is clear: models trained on unrepresentative data fail more frequently, in more expensive ways, in the markets that represent the most growth. Inclusive data is not a values exercise. It is a quality imperative.
“The future of AI will not be determined by who builds the most models. It will be determined by who builds the most reliable ones, and reliability, at scale, requires data that reflects the full complexity of the world those models are asked to serve,” said Udechukwu, who is also exploring initiatives centered on culturally aware AI evaluation and governance systems.