Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

Dang, John; Singh, Shivalika; D'souza, Daniel; Ahmadian, Arash; Salamanca, Alejandro; Smith, Madeline; Peppin, Aidan; Hong, Sungjin; Govindassamy, Manoj; Zhao, Terrence; Kublik, Sandra; Amer, Meor; Aryabumi, Viraat; Campos, Jon Ander; Tan, Yi-Chern; Kocmi, Tom; Strub, Florian; Grinsztajn, Nathan; Flet-Berliac, Yannis; Locatelli, Acyr; Lin, Hangyu; Talupuru, Dwarak; Venkitesh, Bharat; Cairuz, David; Yang, Bowen; Chung, Tim; Ko, Wei-Yin; Shi, Sylvie Shang; Shukayev, Amir; Bae, Sammie; Piktus, Aleksandra; Castagné, Roman; Cruz-Salinas, Felipe; Kim, Eddie; Crawhall-Stein, Lucas; Morisot, Adrien; Roy, Sudip; Blunsom, Phil; Zhang, Ivan; Gomez, Aidan; Frosst, Nick; Fadaee, Marzieh; Ermis, Beyza; Üstün, Ahmet; Hooker, Sara

Abstract: We introduce the Aya Expanse model family, a new generation of 8B and 32B parameter multilingual language models, aiming to address the critical challenge of developing highly performant multilingual models that match or surpass the capabilities of monolingual models. By leveraging several years of research at Cohere For AI and Cohere, including advancements in data arbitrage, multilingual preference training, and model merging, Aya Expanse sets a new state of the art in multilingual performance. Our evaluations on the Arena-Hard-Auto dataset, translated into 23 languages, demonstrate that Aya Expanse 8B and 32B outperform leading open-weight models in their respective parameter classes, including Gemma 2, Qwen 2.5, and Llama 3.1, achieving up to a 76.6% win rate. Notably, Aya Expanse 32B outperforms Llama 3.1 70B, a model with twice as many parameters, achieving a 54.0% win rate. In this short technical report, we present extended evaluation results for the Aya Expanse model family and release its open weights, together with a new multilingual evaluation dataset, m-ArenaHard.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2412.04261 [cs.CL]
	(or arXiv:2412.04261v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.04261
