Amr Keleg

Affiliation

The University of Edinburgh

Edinburgh, Scotland

Hello (أهلًا وسهلًا)! 👋👋

My name is Amr Keleg عمرو قلج (/ʕamr/ /kɯˈɫɯtʃ/). I am a PhD student (CDT in NLP) at the University of Edinburgh, working under the supervision of Walid Magdy and Sharon Goldwater. I am currently studying the variation across the Arabic dialects, their mutual intelligibility, and the implications of this variation for the creation of multi-dialect Arabic datasets.

Feel free to ping me if you are interested in discussing ideas related to the following interests or collaborating on them!

Research Interests

1) Computationally Handling Dialectal Variation (focusing on Arabic)

2) Multilingual and Multicultural research

3) Analyzing Romanized Writings of Non-Latin Languages

  • I developed a rule-based tool transliterating Arabizi (Romanized Arabic) into Arabic script.

Other Interests

  • As an undergraduate student, I was a competitive programming addict (lots of fun experiences 😄). I am also an advocate of open-sourcing data/models/projects (twice a Google Summer of Code student, for Apertium and GNU Octave, and a contributor to other projects like Facebook/Duckling).

News

Oct 31, 2024 Presented my work to CAMeL Lab. Check the slides: here. Thanks, Nizar, for the invitation!
Sep 9, 2024 Attended the GAIN summit in Riyadh, and visited SDAIA for two weeks. Thanks, Dr. Ahmed Ali, for the invitation.
Aug 14, 2024 Our paper “Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets” got an Outstanding Paper Award 🎖️🎖️🎖️
Jul 1, 2024 Gave an online talk to the ARBml community under the title “Distinguishing between the Varieties of Arabic: Dialect Identification is neither Solved nor the Solution”. Check the slides: here.
May 15, 2024 Had a short paper “Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets” accepted to ACL 2024 🎉🎉 See you in Thailand!

Selected Publications

  1. Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets
    Keleg, Amr, Magdy, Walid, and Goldwater, Sharon
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2024
    ACL 2024 Outstanding Paper Award
  2. NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task
    Abdul-Mageed, Muhammad, Keleg, Amr, Elmadany, AbdelRahim, Zhang, Chiyu, Hamed, Injy, Magdy, Walid, Bouamor, Houda, and Habash, Nizar
    In Proceedings of The Second Arabic Natural Language Processing Conference, 2024
  3. ALDi: Quantifying the Arabic Level of Dialectness of Text
    Keleg, Amr, Goldwater, Sharon, and Magdy, Walid
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
  4. Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification
    Keleg, Amr, and Magdy, Walid
    In Proceedings of ArabicNLP 2023, 2023
  5. DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models
    Keleg, Amr, and Magdy, Walid
    In Findings of the Association for Computational Linguistics: ACL 2023, 2023
  6. SMASH at Qur’an QA 2022: Creating Better Faithful Data Splits for Low-resourced Question Answering Scenarios
    Keleg, Amr, and Magdy, Walid
    In Proceedings of the 5th Workshop on Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur’an QA and Fine-Grained Hate Speech Detection, 2022