Advancing Arabic ASR for Disordered Speech: Fine-Tuning Wav2Vec2 on Egyptian Dysarthric Speech
DOI:
https://doi.org/10.58445/rars.2975
Keywords:
Automatic Speech Recognition, Arabic ASR, Wav2Vec2, Dysarthric Speech, Egyptian Arabic, Low-Resource Languages, Speech Recognition Fine-Tuning
Abstract
Despite significant advances in Automatic Speech Recognition (ASR), its application to low-resource languages such as Arabic—especially for speakers with speech disorders—remains underdeveloped. This study presents a novel approach to Arabic ASR for disordered speech by fine-tuning a Wav2Vec2 model on a personalized dataset comprising approximately 1,300 utterances from an Egyptian Arabic speaker with speech impairments. Building on the comparative foundation set by Alsohby (2025), which evaluated four state-of-the-art ASR models across general, dysarthric, and accented speech, we extend the analysis through specialized model adaptation. Our methodology encompasses data preprocessing, fine-tuning, and evaluation using Word Error Rate (WER) and Character Error Rate (CER). Results indicate a substantial performance gain, reducing WER from 0.8516 to 0.3736 and CER from 0.5756 to 0.3478. These findings demonstrate the effectiveness of personalized fine-tuning and underscore the critical need for diverse, domain-specific datasets to improve ASR accessibility for Arabic speakers with speech impairments.
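For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how WER and CER are commonly computed: the Levenshtein (edit) distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. It is not the evaluation code used in the study. The function names are illustrative, and the sketch assumes whitespace tokenization, no text normalization (for example, no diacritic stripping), and that spaces count as characters in CER. The last lines derive the relative reductions implied by the reported figures.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference item missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis item)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # Assumption: spaces are counted as characters; some toolkits remove them.
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(before, after):
    """Fraction of the original error rate removed by fine-tuning."""
    return (before - after) / before


# Made-up example pair (not from the study's dataset): one of four words dropped.
print(wer("انا عايز اشرب مية", "انا عايز اشرب"))  # 0.25

# Relative reductions implied by the figures reported in the abstract.
print(f"WER: {relative_reduction(0.8516, 0.3736):.1%}")  # WER: 56.1%
print(f"CER: {relative_reduction(0.5756, 0.3478):.1%}")  # CER: 39.6%
```

Because both metrics divide by the reference length, they can exceed 1.0 when the hypothesis contains many insertions, so a baseline WER of 0.8516 should be read as an edit-to-word ratio rather than as the share of words that were wrong.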
References
Alsohby, I. (2025). Comprehensive Analysis of Foundation ASR Model Performance: A Comparative Study of Conformer, HuBERT, Wav2Vec2, and Whisper with Insights into Dysarthric, Accented, and General Speech Recognition. Zenodo. https://doi.org/10.5281/zenodo.15459146
Alotaibi, Y., & Alotaibi, M. (2022). Arabic Automatic Speech Recognition: A Systematic Literature Review. Applied Sciences, 12(17), 8898. https://www.mdpi.com/2076-3417/12/17/8898
Abushariah, M. et al. (2024). Modern Standard Arabic Speech Disorders Corpus. International Journal of Speech Technology. https://link.springer.com/article/10.1007/s10772-023-10093-0
Qian, Z. et al. (2023). A survey of technologies for automatic dysarthric speech recognition. EURASIP Journal on Audio, Speech, and Music Processing. https://doi.org/10.1186/s13636-023-00318-2
MacDonald, B. et al. (2021). Personalized ASR Models from a Large and Diverse Disordered Speech Dataset. Google Research. https://blog.research.google/2021/08/personalized-asr-models-from-large-and.html
License
Copyright (c) 2025 Islam Alsohby

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.