What’s inside the report
If you’re working on Arabic LLMs or deploying AI in Arabic-speaking markets, this report gives you a real-world view of what works and where it works best.
This report offers an independent, multi-task evaluation of 18 top-performing large language models on Arabic. Download the full report now.
This report is intended only for recipients who accessed it through their aiXplain subscription. To approve further distribution, please contact care@aixplain.com. We are happy to support your use of this report.
18 LLMs benchmarked, both open and closed, including Arabic-optimized models like Fanar, ALLaM, and LFM
Tested across 11 real-world Arabic NLP tasks, from QA to translation
Smaller models like ALLaM 7B and Gemma 2 often outperform much larger ones
Creative writing and text classification are less sensitive to model size or architecture