Amr Keleg

Affiliation

The University of Edinburgh

Edinburgh, Scotland

Hello (أهلًا وسهلًا)! 👋👋

My name is Amr Keleg عمرو قلج (/ʕamr/ /kɯˈɫɯtʃ/). I am a PhD student (CDT in NLP) at the University of Edinburgh, working under the supervision of Walid Magdy and Sharon Goldwater. I am currently studying the variation across Arabic dialects, their mutual intelligibility, and the implications of this variation for the creation of multi-dialect Arabic datasets.

Feel free to ping me if you would like to discuss ideas related to the interests below or to collaborate on them!

Research Interests

1) Computationally Handling Dialectal Variation (focusing on Arabic)

2) Multilingual and Multicultural research

3) Analyzing Romanized Writings of Non-Latin Languages

  • I developed a rule-based tool transliterating Arabizi (Romanized Arabic) into Arabic script.

Other Interests

As an undergraduate student, I was a competitive programming addict (lots of fun experiences 😄). I am also an advocate of open-sourcing data, models, and projects (twice a Google Summer of Code student, for Apertium and GNU Octave, and a contributor to other projects such as Facebook/Duckling).

News

Mar 1, 2025 My position paper “LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones?” was accepted to the C3NLP workshop co-located with NAACL 2025!
Oct 31, 2024 Presented my work to the CAMeL Lab. Check the slides: here. Thanks, Nizar, for the invitation!
Sep 9, 2024 Attended the GAIN summit in Riyadh and visited SDAIA for two weeks. Thanks, Dr. Ahmed Ali, for the invitation.

Selected Publications

  1. LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones?
    Keleg, Amr
    In Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP, 2025
  2. Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets
    Keleg, Amr, Magdy, Walid, and Goldwater, Sharon
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2024
    ACL 2024 Outstanding Paper Award
  3. NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task
    Abdul-Mageed, Muhammad, Keleg, Amr, Elmadany, AbdelRahim, Zhang, Chiyu, Hamed, Injy, Magdy, Walid, Bouamor, Houda, and Habash, Nizar
    In Proceedings of The Second Arabic Natural Language Processing Conference, 2024
  4. ALDi: Quantifying the Arabic Level of Dialectness of Text
    Keleg, Amr, Goldwater, Sharon, and Magdy, Walid
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
  5. DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models
    Keleg, Amr, and Magdy, Walid
    In Findings of the Association for Computational Linguistics: ACL 2023, 2023