Why voice assistants in the Middle East will need cultural intelligence
In the Middle East, AI-powered voice assistants must grasp nuance, speak with care and be culturally sensitive.

With the landscape of generative AI shifting, voice assistants are becoming increasingly popular in our daily routines. However, in regions like the Middle East, where the Arabic language has many regional dialects, each with its unique vocabulary, expressions, and pronunciation, it takes more than just crisp speech recognition to win users over.
In particular, Khaleeji Arabic—the dialect spoken across Gulf countries like the UAE, Saudi Arabia, Kuwait, and Bahrain—adds a layer of complexity. It includes unique vocabulary, pronunciation, and expressions not found in Modern Standard Arabic (MSA) or other regional dialects. For AI to truly resonate, it must understand what users say, how they say it, and what that means in context.
Put simply, voice assistants need to be significantly more intuitive and responsive, have better language understanding, and be culturally sensitive.
WHY LOCALIZATION MATTERS
A recent survey conducted by Researchscape International and commissioned by global tech company Yango revealed that 92% of respondents in the UAE would prefer a smart/AI assistant specifically designed for the Middle East. This preference underscores the desire for features that align with the region’s cultural and linguistic nuances.
Notably, 66% of respondents emphasized the importance of the assistant’s ability to answer questions about Arabic culture, literature, and traditions.
“Spoken expressions often carry meanings far beyond their literal words,” says Deepak Gupta, an AI and data expert based in Abu Dhabi. He points out that Arabic phrases like “Inshallah” can indicate everything from hope to polite doubt, depending on context. Another phrase, “Mashallah”, might express admiration or sarcasm, depending on tone.
A voice assistant that misreads these subtleties could respond inappropriately, damaging user trust. “Accounting for regional expressions and tone makes voice assistants more natural, accurate, and trustworthy,” he adds.
For voice assistants, this means understanding not only different languages but also dialects, cultural references, and even accents.
Rami Abu Arja of Yasmina, an Arabic-speaking AI voice assistant developed by Yango Group, aims to make interactions with voice assistants feel like conversations with friends or family members. “You can speak to Yasmina just like you’d talk to your brother or sister,” he says. “Use your normal tone, mix in some English or slang; she understands.”
Beyond fluency in Khaleeji Arabic and English, Yasmina is trained to understand the local culture. Abu Arja says, “She knows what karak is. She knows where to find it. She understands the jokes and daily habits of people here.”
Meanwhile, researchers have introduced a system dubbed VoxArabica, which advances Arabic speech recognition by identifying 17 different Arabic dialects in addition to MSA.
Aimed at democratizing access, VoxArabica employs models such as HuBERT for dialect identification and Whisper and XLS-R for automatic speech recognition. To enhance accuracy, these models were fine-tuned on various Arabic dialects, including Egyptian and Moroccan. The system also offers zero-shot support for additional dialects, making it adaptable to the Middle East’s linguistic diversity.
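For readers curious how a two-stage design like this might fit together, here is a minimal Python sketch. It is not VoxArabica's code: the names `transcribe`, `identify_dialect`, `recognizers`, and `general_recognizer` are hypothetical, and falling back to a general-purpose model for an unlisted dialect is an assumption made for illustration. The idea is simply that a dialect classifier runs first and its label selects which speech recognizer handles the audio.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type: a recognizer turns raw audio bytes into text.
Recognizer = Callable[[bytes], str]


@dataclass
class Transcript:
    dialect: str         # label produced by the dialect-identification step
    text: str            # output of whichever recognizer was used
    used_fallback: bool  # True when no dialect-specific recognizer existed


def transcribe(
    audio: bytes,
    identify_dialect: Callable[[bytes], str],
    recognizers: Dict[str, Recognizer],
    general_recognizer: Recognizer,
) -> Transcript:
    """Route audio to a dialect-specific recognizer when one is available."""
    dialect = identify_dialect(audio)
    recognizer = recognizers.get(dialect)
    if recognizer is None:
        # Assumption: unlisted dialects go to a general-purpose model.
        return Transcript(dialect, general_recognizer(audio), True)
    return Transcript(dialect, recognizer(audio), False)
```

In a real system the callables would wrap trained models; here they are left as plain functions so the routing logic can be read and tested on its own.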
To localize Yasmina, the team spent two years consulting with local creatives, linguists, and technologists to ensure the assistant could interpret dialects and respond naturally. “We substituted the lack of Arabic data online with actual people who could share their knowledge,” says Abu Arja. More than 1,000 users tested Yasmina in real-world home environments, helping to fine-tune her responses and her grasp of both Khaleeji Arabic and English for the UAE market.
CULTURAL GROUNDING
Gupta believes cultural grounding is critical. “The assistant should feel inclusive,” he says. That means respecting traditions but not making assumptions. Some users may want formal greetings like “As-salamu alaykum”, while others prefer casual interactions. Rather than reinforce stereotypes, voice assistants should ask users about their preferences or offer customizable options.
Privacy is also a key concern. In an era where AI devices are often viewed skeptically, voice assistants must earn user trust. Yasmina, for instance, only activates when her name is called, and includes a mute button for complete control. Abu Arja says, “We’re not just teaching machines to talk. We’re teaching them to belong.”
In a region where meaning is embedded in culture as much as in language, voice assistants that understand the full picture will find a lasting place in people’s lives.