The first day of Eid al-Fitir 2014—the feast that marks the end of the month of Ramadan—seems an auspicious day to celebrate Arab diversity. Mo3jam (معجم) is a user-generated dictionary of colloquial Arabic (the word itself means lexicon or dictionary). It was created in 2009 by Saudi developer Abdullah Arif to document the local expressions spoken by the some 290 million speakers of Arabic in more than twenty countries.

Mo3jam turns to the crowd to explore, share and evaluate meanings in the various colloquial Arabics. The interface is in both Arabic and English. Crowd-generated entries include tags for the regional dialect(s) using the term. Depending on the author, they may also include transcription in Latin letters, a definition in English or French and a sentence or passage written in Arabic script illustrating how the expression can be used. When adding a new term, Mo3jam has built in the functionality of Yamli, allowing you to type Arabic without an Arabic keyboard. Mo3jam gives all users, registered or not, the opportunity to vote on each entry, since in the words of creator Arif “quality should derive from consensus.”

At the time this was written, Mo3jam seemed to contain some 5500 expressions. In terms of pure numbers, the two communities that seem most active on the site are the Saudis and the Moroccans, with the Egyptians, Yemenis and Algerians trailing close behind. Arabic linguists have increasingly turned to social media for their data. A quick search for “Arabic” in the papers from the Language Resources and Evaluation Conference held in Iceland in May 2014 shows how the natural language processing community is making considerable strides in mining Twitter, Facebook and Youtube.

Who will start using Mo3jam next?


Edited by: David J. Wrisley

Text by: David J. Wrisley