Arabic Natural Language Processing Workshop

WANLP 2015

Workshop on Arabic Natural Language Processing 
Including the Second Shared Task on Automatic Arabic Error Correction

Collocated with ACL-IJCNLP 2015, Beijing, China 


11 January 2015: First Call for Workshop Papers

19 February 2015: Second Call for Workshop Papers

21 May 2015 (was 14 May 2015): Workshop Paper Due Date

8 June 2015 (was 4 June 2015): Notification of Acceptance

21 June 2015: Camera-ready papers due

30 July 2015: Workshop Date

Workshop Website:  

For Shared Task important dates, see below.


Arabic Natural Language Processing (NLP) has made substantial progress over the last 15 years. Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. This workshop follows in the footsteps of those efforts and provides a forum for researchers to share and discuss their ongoing work.

We invite submissions on topics that include, but are not limited to, the following: 
  • Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, sentiment analysis, Arabic dialect modeling, etc.
  • Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media, etc.
  • Resources: dictionaries, annotated data, specialized databases, etc.
Submissions may include work in progress as well as finished work. Submissions must have a clear focus on specific issues pertaining to the Arabic language, whether Standard Arabic, dialectal Arabic, or a mix of the two. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work. Submissions may be up to 8 pages long, plus 2 pages for references. A shared task on Arabic text error correction is associated with the workshop (details below).


Main Workshop Papers

EDRAK: Entity-Centric Data Resource for Arabic Knowledge
Mohamed H. Gad-elrab, Mohamed Amir Yosef and Gerhard Weikum 

POS-tagging of Tunisian Dialect Using Standard Arabic Resources and Tools
Ahmed Hamdi, Alexis Nasr, Nizar Habash and Nuria Gala 

Joint Arabic Segmentation and Part-Of-Speech Tagging
Shabib AlGahtani and John McNaught 

A Conventional Orthography for Algerian Arabic
Houda Saadane and Nizar Habash 

Deep Learning Models for Sentiment Analysis in Arabic
Ahmad Al Sallab, Hazem Hajj, Gilbert Badaro, Ramy Baly, Wassim El Hajj and Khaled Bashir Shaban 

Annotating Targets of Opinions in Arabic using Crowdsourcing
Noura Farra, Kathy McKeown and Nizar Habash 

Best Practices for Crowdsourcing Dialectal Arabic Speech Transcription
Samantha Wray, Hamdy Mubarak and Ahmed Ali 

A Light Lexicon-based Mobile Application for Sentiment Mining of Arabic Tweets
Gilbert Badaro, Ramy Baly, Rana Akel, Linda Fayad, Jeffrey Khairallah, Hazem Hajj, Khaled Shaban and Wassim El-Hajj 

Multi-Reference Evaluation for Dialectal Speech Recognition System: A Study for Egyptian ASR
Ahmed Ali, Walid Magdy and Steve Renals 

DIWAN: A Dialectal Word Annotation Tool for Arabic
Faisal Al-Shargi and Owen Rambow 

Classifying Arab Names Geographically
Hamdy Mubarak and Kareem Darwish 

Robust Part-of-Speech Tagging of Arabic Text
Hanan Aldarmaki and Mona Diab 

Answer Selection in Arabic Community Question Answering: A Feature-Rich Approach
Yonatan Belinkov, Alberto Barrón-Cedeño and Hamdy Mubarak 

Natural Language Processing for Dialectical Arabic: A Survey
Abdulhadi Shoufan and Sumaya Alameri 

A Pilot Study on Arabic Multi-Genre Corpus Diacritization
Houda Bouamor, Wajdi Zaghouani, Mona Diab, Ossama Obeid, Kemal Oflazer, Mahmoud Ghoneim and Abdelati Hawwari 

Shared Task Papers

The Second QALB Shared Task on Automatic Text Correction for Arabic
Alla Rozovskaya, Houda Bouamor, Nizar Habash, Wajdi Zaghouani, Ossama Obeid and Behrang Mohit 

QCRI@QALB-2015 Shared Task: Correction of Arabic Text for Native and Non-Native Speakers’ Errors
Hamdy Mubarak, Kareem Darwish and Ahmed Abdelali 

Arib@QALB-2015 Shared Task: A Hybrid Cascade Model for Arabic Spelling Error Detection and Correction
Nouf AlShenaifi, Rehab AlNefie, Maha Al-Yahya and Hend Al-Khalifa 

SAHSOH@QALB-2015 Shared Task: A Rule-Based Correction Method of Common Arabic Native and Non-Native Speakers’ Errors
Wajdi Zaghouani, Taha Zerrouki and Amar Balla 

GWU-HASP-2015@QALB-2015 Shared Task: Priming Spelling Candidates with Probability
Mohammed Attia, Mohamed Al-Badrashiny and Mona Diab 

QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction
Houda Bouamor, Hassan Sajjad, Nadir Durrani and Kemal Oflazer 

UMMU@QALB-2015 Shared Task: Character and Word level SMT pipeline for Automatic Error Correction of Arabic Text
Fethi Bougares and Houda Bouamor 

TECHLIMED@QALB-Shared Task 2015: a hybrid Arabic Error Correction System
Djamel MOSTEFA, Jaber ABUALASAL, Omar ASBAYOU, Mahmoud GZAWI and Ramzi Abbès 

QALB 2015 Shared Task: CUFE Arabic Error Correction System
Michael Nawar 


Following the success of the First Shared Task on Automatic Arabic Error Correction in the Arabic NLP Workshop 2014 (WANLP-2014, EMNLP, Doha), we will conduct the Second Shared Task on Automatic Arabic Error Correction as part of WANLP-2015.  

As in the 2014 competition, the task relies on resources created under the Qatar Arabic Language Bank (QALB) project. In addition to the correction of Arabic native text (news comments), the 2015 shared task will also cover the correction of texts written by non-native speakers.

To participate, you need to use this link to register and receive the training data. You also need to subscribe to the QALB discussion group to receive shared task notifications.

Participants are expected to also submit a short system description paper (4 pages + 2 for references).   

The following FAQ page describes the steps for participating in the shared task.  In summary, the shared task will use the following calendar:
  • February 15: Release of the initial training data
  • April 1: Final Release of the training data
  • April 30: Registration deadline
  • May 16: Test set available
  • May 28: Systems' outputs collected
  • June 3: System description paper deadline
  • June 10: Shared task results and answer keys to be announced
  • June 10: Shared task paper notification
  • June 21: Camera ready deadline for the system description paper
  • July 30: ACL 2015 Workshop in Beijing


Submission Types

There are two possible submission types:

  • Full papers: maximum 8 pages + 2 for references
  • Shared Task System papers: maximum 4 pages + 2 for references; authors must have participated in the shared task

Blind Reviewing Policy

The workshop follows a blind reviewing policy. The authors should omit their names and affiliations from the paper and avoid self-references that reveal their identity. Papers that do not conform to these requirements will be rejected without review. 

Submission Format

All submissions must be electronic in PDF and must be formatted using the ACL 2015 style files available at 

Submission Site

Papers should be submitted via the START Conference Manager at : 

Please do not send papers by email to the organizers. Such papers will not be considered. 

Multiple Submission Policy

Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors must inform the organizers immediately if a paper is to be withdrawn from the workshop for any reason. Attempting to publish the same paper, or a paper with a large overlap (50%), elsewhere may lead to rejection of the paper even after an acceptance notification has gone out.


Program Co-chairs

Nizar Habash, New York University Abu Dhabi
Stephan Vogel, Qatar Computing Research Institute
Kareem Darwish, Qatar Computing Research Institute

Publication Co-chairs
Nadi Tomeh, Paris 13 University
Houda Bouamor, Carnegie Mellon University Qatar

Publicity Chair
Wajdi Zaghouani, Carnegie Mellon University Qatar 

Shared Task Committee

Alla Rozovskaya (co-chair), Columbia University
Houda Bouamor (co-chair), Carnegie Mellon University in Qatar
Behrang Mohit,
Wajdi Zaghouani, Carnegie Mellon University Qatar 
Ossama Obeid, Carnegie Mellon University Qatar
Nizar Habash (advisor), New York University Abu Dhabi

Program Committee

Abdelmajid Ben-Hamadou, University of Sfax, Tunisia
Abdelsalam Nwesri, University of Tripoli, Libya
Achraf Chalabi, Microsoft Research, Egypt
Ahmed Ali, Qatar Computing Research Institute, Qatar
Ahmed El Kholy, Columbia University, USA  
Ahmed Rafea, The American University in Cairo, Egypt
Alberto Barrón Cedeño, Qatar Computing Research Institute, Qatar
Alexis Nasr, University of Marseille, France
Ali Farghaly, Monterey Peninsula College, USA
Almoataz B. Al-Said, Cairo University, Egypt
Aly Fahmy, Cairo University, Egypt
Azzeddine Mazroui, University Mohamed I, Morocco
Bassam Haddad, University of Petra, Jordan
Emad Mohamed, Suez Canal University, Egypt
Francisco Guzman, Qatar Computing Research Institute, Qatar
Ghassan Mourad, Université Libanaise, Lebanon
Hamdy Mubarak, Qatar Computing Research Institute, Qatar
Hazem Hajj, American University of Beirut, Lebanon
Hend Alkhalifa, King Saud University, Saudi Arabia
Houda Bouamor, Carnegie Mellon University Qatar, Qatar
Imed Zitouni, Microsoft Research, USA
Joseph Dichy, Université Lyon 2, France
Kareem Darwish, Qatar Computing Research Institute, Qatar
Karim Bouzoubaa, Mohammad V University, Morocco
Kemal Oflazer, Carnegie Mellon University Qatar, Qatar
Khaled Shaalan, The British University in Dubai, UAE
Khaled Shaban, Qatar University, Qatar
Khalid Choukri, ELDA, European Language Resource Association, France
Lamia Hadrich Belguith, University of Sfax, Tunisia
Mohamed Elmahdy, Qatar University, Qatar
Mohamed Maamouri, Linguistic Data Consortium, USA
Mona Diab, George Washington University, USA
Mustafa Jarrar, Bir Zeit University, Palestine
Nada Ghneim, Higher Institute for Applied Sciences and Technology, Syria
Nadi Tomeh, University Paris 13, France
Nizar Habash, New York University Abu Dhabi, UAE
Otakar Smrž, Džám-e Džam Language Institute, Czech Republic
Owen Rambow, Columbia University, USA
Preslav Nakov, Qatar Computing Research Institute, Qatar
Ramy Eskander, Columbia University, USA
Salwa Hamada, Cairo University, Egypt
Samantha Wray, Qatar Computing Research Institute, Qatar
Shahram Khadivi, Tehran Polytechnic, Iran
Sherri Condon, The MITRE Corporation, USA
Stephan Vogel, Qatar Computing Research Institute, Qatar
Taha Zerrouki, University of Bouira, Algeria
Wael Salloum, Columbia University, USA  
Walid Magdy, Qatar Computing Research Institute, Qatar