An ostrich (left, for the benefit of AI models) and a camel. Getty Images / Antonie Robertson / The National

I said camel, not ostrich! Why AI makes such a meal of Arabic words


Daniel Bardsley

If you can tell the difference between a large mammal with a hump and an ungainly flightless bird, you're more switched on than some AI models.

Researchers in the UAE have found that even AI models optimised for use with Arabic struggle to understand content from the Arab world. They may think a camel is an ostrich, say a traditional Moroccan hat is a Mexican sombrero, or fail to properly identify a popular Gulf dessert.

While the errors may raise a laugh, they highlight a serious point: AI often makes mistakes when dealing with material from the Arab world, including images. For AI developers, there is a reputational risk in ignoring cultural nuances, giving broad-brush answers or, in the worst case, getting things completely wrong.

Researchers from Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in Abu Dhabi recorded the errors when they tested half a dozen “prominent” AI models, five of them designed to work with Arabic. They are vision-language models, so they can interpret video, pictures or text and produce a written description or summary.

The five Arabic-specialised models – all open source and free for the public to use – include some produced by researchers in the UAE and others created by start-up technology companies. One of the researchers, Karima Kadaoui, said the models sometimes knew “a bit about Arab culture” without understanding the cultural specifics.

“Let’s say there’s an instrument that is very specific to a culture, and the models would either describe it in very vague terms, just saying an instrument instead of using the actual name, or end up misattributing it completely to some different culture,” she said.

“For example, there’s a lady wearing a northern Moroccan hat. The majority of models would simply not recognise it as anything – they would say something vague like just ‘hat’, and then some models would be over-confident and claim that it belongs to a different culture. In this example, a Mexican hat.”

Ms Kadaoui, a PhD student, told The National that models would see a camel and mistake it for an ostrich “very commonly”. The sweet dessert Omani halwa was identified in various mistaken ways by different models. One called it a sweet that “could be a cake” covered in nuts, another suggested it was a pastilla, a Moroccan pie, and a third described it as a type of baklava.

The researchers also found what they described as dialect mixing, such as AI starting a response in Moroccan Arabic but mixing in words from Egyptian Arabic. When giving responses that were supposed to be in a particular dialect, models often reverted to Modern Standard Arabic.

In a statement, the university said such examples highlight the challenges for Arabic-enabled AI, which include correctly recognising an object, understanding its cultural context, correctly using a particular Arabic dialect and giving coherent responses.

Although the research focused on Arabic-enabled AI and objects from the Arab world, the researchers said that mistakes could also occur when AI was identifying material from other areas. They gave South-East Asia as an example.

Researchers found that AI models mistook traditional Moroccan hats for Mexican sombreros. Getty Images

Sometimes AI models fail when interpreting material from the Arab world because the initial annotation of images, for example, was done by people from other parts of the world. That human annotation is crucial because it provides the data on which the model is built. Models may become more culturally grounded if they are trained on more inclusive, diverse human-annotated data sets.
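The effect of annotation coverage can be illustrated with a toy sketch (the labels and tags below are hypothetical, not from the study): if the vocabulary used to annotate training images has no entry for a culture-specific item, the closest available label wins, however wrong it is.

```python
def closest_label(item_tags, label_vocab):
    """Pick the vocabulary label sharing the most tags with the item."""
    return max(label_vocab, key=lambda lbl: len(item_tags & label_vocab[lbl]))

# Hypothetical visual tags for a northern Moroccan hat
moroccan_hat = {"headwear", "wide_brim", "woven", "tassels"}

# Vocabulary built without Arab-world annotators: no matching label exists,
# so the model falls back to the nearest thing it knows
narrow_vocab = {
    "hat":      {"headwear"},
    "sombrero": {"headwear", "wide_brim", "woven"},
}
print(closest_label(moroccan_hat, narrow_vocab))  # -> "sombrero"

# Vocabulary extended by culturally grounded annotation
diverse_vocab = dict(narrow_vocab)
diverse_vocab["northern_moroccan_hat"] = {"headwear", "wide_brim", "woven", "tassels"}
print(closest_label(moroccan_hat, diverse_vocab))  # -> "northern_moroccan_hat"
```

The sketch also reproduces the vague-answer failure: remove "sombrero" from the narrow vocabulary and the best remaining match is simply "hat".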

The researchers detailed their findings in a paper presented at a conference of the European Chapter of the Association for Computational Linguistics, held in Morocco in March. The paper was written by seven MBZUAI researchers and three from the company Toloka AI.

Another author, Hamdan Al Ali, a PhD student at the university, said that part of the reason models misidentify images lies in the way they analyse pictures.

“It’s statistics, it’s seeing what is the most probable,” he said. “So, for example, the image of the camel was sometimes identified as a llama. Models say, ‘There are four different points of this animal touching the ground. It has this specific colour – brownish colour,’ and so on, and based on that, it identifies it.

“The points that a human would look at to identify this as a camel, this as a llama, are different. A human would look at the hump on top of the camel.”
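The kind of probabilistic feature matching Mr Al Ali describes can be sketched in a toy example (not the researchers' method, and the features are invented for illustration): a classifier that weights all coarse features equally can confuse animals that share most of them, because it gives no special importance to the one cue a human would rely on, such as the hump.

```python
# Coarse binary features: (four_legs, brownish, long_neck, hump)
PROTOTYPES = {
    "camel":   (1, 1, 1, 1),
    "llama":   (1, 1, 1, 0),
    "ostrich": (0, 1, 1, 0),  # two legs, but long-necked and brownish
}

def identify(observed, weights=(1, 1, 1, 1)):
    """Return the label whose prototype best matches the observed features."""
    def score(proto):
        return sum(w for w, o, p in zip(weights, observed, proto) if o == p)
    return max(PROTOTYPES, key=lambda label: score(PROTOTYPES[label]))

print(identify((1, 1, 1, 1)))  # hump visible -> "camel"

# If the hump is occluded in the image, the observation matches the
# llama prototype on all four features and the camel on only three
print(identify((1, 1, 1, 0)))  # -> "llama"
```

Upweighting the hump feature, as a human implicitly does, makes the distinction robust; a model trained mostly on images where that cue was never labelled has no reason to learn that weighting.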

Ms Kadaoui said the research could encourage AI developers to produce models that are more culturally astute.

“It’s entering the conversation on making these models more inclusive,” she said. “You have to ask for inclusivity; it’s rarely something that comes by itself. You have to fight for it, you have to demand it and you have to keep doing work that exposes the gaps and the bias.”

Updated: May 01, 2026, 6:00 PM