aiXplain Joins BALSAM as Founding Member to Advance Arabic NLP

The region’s leading AI organizations will develop Arabic NLP datasets to support LLM benchmarking and improvement, supercharging AI development and adoption across Arab states.

Today, we are thrilled to announce that aiXplain is one of the founding members of Benchmarking Arabic LLM Standards and Metrics (BALSAM), a consortium of leading research and governmental organizations from across the Middle East, North Africa (MENA) region. BALSAM member organizations jointly developed standardized, domain-specific datasets for LLM benchmarking and evaluation across a diverse set of Arabic NLP tasks. The research will help regional and global AI companies and developers build stronger AI applications for over 400 million Arabic speakers worldwide. It also represents a significant step for broader efforts surrounding multilingual AI tools and accessibility to AI innovation outside of the English-speaking world.

Advancing AI Benchmarking for Non-English NLP Tasks

BALSAM marks a significant milestone in the advancement of scientific evaluation of AI models for non-English NLP tasks. BALSAM has developed over 60 different tasks to assess the performance of LLMs in Arabic NLP across multiple capabilities. From mathematical problem-solving and summarization, to question generation and creative writing, these tasks go beyond existing benchmarking standards to enable comprehensive LLM benchmarking for use cases in business, academia, research, and government, among others. In all, there are 1,140 datasets. The consortium focuses on several key objectives, including:

Dataset development and curation: Pooling resources and expertise to create high-quality datasets tailored for AI testing, covering diverse domains and various Arabic dialects to enhance the robustness and versatility of LLMs.
Innovative evaluation frameworks: Establishing standardized evaluation frameworks to rigorously assess the performance of LLMs developed by consortium members, facilitating transparent comparisons and driving continuous improvement.
Ethical and responsible AI development: Prioritizing ethical considerations and responsible AI practices throughout the development process, ensuring fairness, transparency, and accountability in AI models and applications.

Datasets will be available for LLM benchmarking and evaluations through the BALSAM website and associated evaluation platform. Parties interested in benchmarking an LLM can either supply an OpenAI compatible API or onboard their models onto the aiXplain platform, where benchmarking will be performed by the BALSAM platform. Additionally, BALSAM will publish its research and methodology on arXiv, and aims to present it at ArabicNLP 2024 to support reproducibility for other languages.

Accelerating AI Use Cases Across the MENA Region

As a founding member of BALSAM, aiXplain is furthering its commitment to fostering inclusive innovation in global AI development and advancing AI adoption across the Arab world.

“By harnessing the diverse strengths and capabilities of our esteemed partners across the region, we will advance the objective evaluation and continuous improvement of Arabic AI models, unlock new AI use cases for industries across the region, and accelerate the positive impact of AI for hundreds of millions of people.”
– Hassan Sawaf, Founder & CEO of aiXplain

In addition to aiXplain, BALSAM includes the MENA region’s leading AI organizations:

King Salman Global Academy for Arabic Language (KSAA)
New York University Abu Dhabi
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Qatar Computing Research Institute (QCRI)
Saudi Data and Artificial Intelligence Authority (SDAIA)
Qatar University
King Abdulaziz University (KAU)
King Saud University (KSU)
Bisha-University
ArbML

“Our collective efforts will not only accelerate the pace of AI innovation across the globe but also contribute to the responsible and ethical deployment of specific AI technologies for different contexts. If you can’t accurately measure AI performance, you cannot expect it to improve. Through continuous knowledge-sharing, BALSAM will serve as a hub for Arabic-based AI initiatives, and also provide a replicable model for other regions looking to support foundational AI development and localization.”
– Dr. Kareem Darwish, aiXplain Scholar

“BALSAM is a pioneering collaboration among multiple institutes with a shared interest in Arabic natural language processing and the development of robust language models that serve the Arab world. This exciting initiative brings together a team dedicated to creating a standard benchmark for Arabic AI that is fair, open, easy to use, and optimally aligned with Arabic language and culture.”
– Nizar Habash, Professor of Computer Science at NYU Abu Dhabi

“BALSAM marks an important milestone, enabling us to evaluate the capabilities of large language models across various tasks in Arabic. This can inform future research and development, thus paving the way for enhancing existing models and for building more powerful Arabic models in the future.”
– Preslav Nakov, Department Chair of Natural Language Processing, and Professor of Natural Language Processing at MBZUAI

aiXplain invites academic and governmental institutions, as well as industry partners, who share a commitment to advancing objective, responsible AI research and development to join or collaborate with BALSAM.

Advancing AI Benchmarking for Non-English NLP Tasks

Accelerating AI Use Cases Across the MENA Region

In addition to aiXplain, BALSAM includes the MENA region’s leading AI organizations:

aiXplain, Inc.