Amr Keleg
MBZUAI
Abu Dhabi, UAE
Hello (أهلًا وسهلًا)! 👋👋
My name is Amr Keleg عمرو قلج (/ʕamr/ /kɯˈɫɯtʃ/). I am a postdoctoral researcher at MBZUAI, working on cultural alignment with Fajri Koto. I recently defended my PhD at the University of Edinburgh, where I was supervised by Walid Magdy and Sharon Goldwater. My PhD studied the variation among the Arabic dialects, their mutual intelligibility, and the implications of this variation for the creation of multi-dialect Arabic datasets.
Feel free to ping me if you would like to discuss ideas related to the interests below or collaborate on them!
Research Interests
1) Computationally Handling Dialectal Variation (focusing on Arabic)
- Revisiting Common Assumptions about Arabic Dialects in NLP - ACL 2025
  Keleg, Amr, Goldwater, Sharon, and Magdy, Walid
- Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets - ACL 2024 (Outstanding Paper award, Oral presentation)
  Keleg, Amr, Magdy, Walid, and Goldwater, Sharon
- NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task - ArabicNLP 2024 (Shared Task Organization, co-located with ACL 2024)
  Abdul-Mageed, Muhammad, Keleg, Amr, Elmadany, AbdelRahim, Zhang, Chiyu, Hamed, Injy, Magdy, Walid, Bouamor, Houda, and Habash, Nizar
- ALDi: Quantifying the Arabic Level of Dialectness of Text - EMNLP 2023
  Keleg, Amr, Goldwater, Sharon, and Magdy, Walid
- Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification - ArabicNLP 2023 (Oral presentation, co-located with EMNLP 2023)
  Keleg, Amr, and Magdy, Walid
2) Multilingual and Multicultural research
- LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones? - C3NLP (co-located with NAACL 2025)
  Keleg, Amr
- DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models - ACL 2023 (Findings)
  Keleg, Amr, and Magdy, Walid
- An Unsupervised Method for Weighting Finite-state Morphological Analyzers - LREC 2020
  Keleg, Amr, Tyers, Francis, Howell, Nick, and Pirinen, Tommi
3) Analyzing Romanized Writings of Non-Latin Languages
- I developed a rule-based tool that transliterates Arabizi (Romanized Arabic) into Arabic script.
Other Interests
As an undergraduate student, I was a competitive programming addict (lots of fun experiences 😄). I am also an advocate of open-sourcing data, models, and projects: I was twice a Google Summer of Code student (for Apertium and GNU Octave) and have contributed to other projects such as Facebook/Duckling.
Note: The list of resources created as part of my research can be accessed through this page.
News
| Date | News |
|---|---|
| Oct 6, 2025 | Joined MBZUAI as a postdoctoral researcher, working with Fajri Koto on cultural alignment. |
| Aug 20, 2025 | Passed my PhD viva (with minor comments), examined by Mona Diab and Edoardo Ponti. |
| Jul 14, 2025 | Our paper “Revisiting Common Assumptions about Arabic Dialects in NLP” was accepted to ACL 2025! |