Humanoid behaviour from machines is largely thanks to advances in natural language processing. Photo: AFP

Abu Dhabi unveils world’s biggest Arabic AI language processing model


Kelsey Warner

Abu Dhabi's cutting-edge research hub has unveiled the world's biggest natural language processing model for the Arabic language.

Natural language processing, or NLP, is a key part of the booming artificial intelligence sector, helping computers to decode the spoken and written word to boost the development of everything from language translation tools to Siri and Alexa-style smart assistants.

The Noor model, developed at the Technology Innovation Institute, may give the Arab world a new edge in the push to digitalise as tools like chatbots, market intelligence and machine translation skew heavily to English and Chinese-speaking markets.

The priority is to find ways for companies and academics to use Noor to build new tools, such as sentiment analysis across social media or new Arabic virtual assistants, Dr Ebtesam Almazrouei, a director at TII who led the project, told The National.

But she said a smaller version of Noor would also be made available to the public, as an open source model.

"We want [Noor] to contribute to society," she said.

The size of Noor is significant. In NLP, the size of a model is measured by its number of parameters, the numerical values the model learns during training. Parameters are the building blocks of machine learning. In general, the greater the number of parameters, the more complex and capable an NLP model can be.
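
To make the idea of a parameter count concrete, the sketch below tallies the learned values (weights and biases) in a toy fully connected network. The function name and layer sizes are illustrative assumptions only; this is not a description of how Noor itself is built.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a toy fully connected network.

    layer_sizes is a hypothetical list of layer widths, e.g. [512, 1024, 512].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input-output connection
        total += n_out         # one bias per output unit
    return total


# Widening the middle layer increases the number of learned values.
small = count_dense_parameters([512, 1024, 512])
large = count_dense_parameters([512, 4096, 512])
print(small, large)
```

Even this toy example reaches millions of parameters; models such as those described here are counted in billions.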

Previously, the largest available Arabic model was AraGPT, with 1.5 billion parameters. Noor has 10 billion parameters and was trained on a dataset that combines web data with books, poetry, news articles and technical information, significantly widening the range of applications that can be built with it.

According to TII, this is the largest high-quality cross-domain Arabic dataset ever made.

"At the 10 billion scale, our model can tackle more advanced tasks and take in more complex instructions from humans to machines," Dr Almazrouei said.

"For instance, it can summarise texts, assist with writing — for example, a press release. Also it can be used to power more natural and effective chatbots, or even evaluate the language level of employees. This is only the start, and we want to scale to even larger and more capable models in the future."

TII, the applied research arm of Abu Dhabi's Advanced Technology Research Council, is a critical part of the UAE's efforts to diversify from a reliance on oil exports and develop a knowledge-based economy. Noor is a first step in the research hub's efforts to contribute to the wider UAE Strategy for Artificial Intelligence by accelerating the adoption and integration of AI into the wider economy.

“Our expert teams have demonstrated yet again that this region can achieve breakthrough R&D outcomes to impact the world,” said Dr Ray Johnson, chief executive of TII.

Updated: April 15, 2022, 3:49 AM