The Living Arabic Project: An Arabic Education Tool
Creating the first online, multi-dialect Arabic dictionary that focuses on practical usage and can be searched in Arabic and English
The Living Arabic Project is the first multi-dialect, online Arabic dictionary, improving how people learn the Arabic language by changing how we perceive and interact with it. But for readers who are unfamiliar with Arabic, the question that you’re probably asking is, why is that important? And for those who have studied Arabic, how do you know this will actually work?
Let’s start with the basics. Arabic is one of the most difficult languages to learn for non-native speakers. The US Department of State rates it as one of the most challenging languages for native English speakers, on par with Mandarin, Japanese, and Korean.
The complex grammar, multitude of dialects, and lack of good learning resources are the most commonly cited problems. Even students who graduate with a bachelor’s degree in Arabic studies often don’t have a basic working proficiency in Arabic. However, all these challenges to learning Arabic stem from what linguists call diglossia, which means there are essentially two languages operating in parallel.
Students of Arabic, therefore, have to learn at least two distinct languages that are used side by side every day. They must learn so-called Classical Arabic (CA) or Modern Standard Arabic (MSA) and at least one dialect. MSA has become standard across the Arab countries and is based on the dialect spoken 1400 years ago in what is now western Saudi Arabia. It is generally used in writing, media, and formal situations such as political and academic speeches. However, in everyday interactions, such as conversations with friends and family or online chat forums, the local dialect is used, and it varies by region.
Dialects have a grammatical structure, pronunciation, and vocabulary distinct from MSA. In Egypt, the local dialect draws from Coptic, Syriac, Italian, and French, as well as MSA. Just imagine speaking one language at home and in day-to-day interactions, but using a different language when engaging in media, writing, and formal situations. An Arabic student focusing on Egypt needs to learn both MSA and the Egyptian dialect to be able to operate.
There have been a number of efforts to “overcome” diglossia, such as new textbooks, archaic linguistic texts explaining dialectology, and an array of dictionaries. While these help, the academic and two-dimensional organizational structure of printed text fails to address the challenge of diglossia across the Arab world. Arabs themselves have proposed modernizing the language by simplifying the grammar or doing away with either the dialects or MSA. None of these proposals is practical, or even possible, to implement across the Arab countries.
Making Arabic learning effective requires more than just a new textbook, and it does not require changing the language. It requires a tool that changes how people perceive Arabic.
This is where the Living Arabic Project comes in. It utilizes Arabic’s root-word structure to link together multiple dialect dictionaries with MSA. Instead of seeing diglossia simply as a challenge to be overcome or obstacle to be done away with, it uses diglossia to show the richness inherent in the Arabic language.
The core product is a unique online Arabic dictionary. Three features differentiate this dictionary from other online Arabic dictionaries:
- It is the only online Arabic dictionary that has multiple dialect dictionaries available to users, and is the first and only Arabic dictionary that cross-references different dialects. No matter what Arab country they are in, users can find the data they need. Few resources for Arabic dialects exist online (usually just short word lists that are pasted into an online forum). The Living Arabic Project, on the other hand, lets users translate colloquial poetry and music, online chat forums and social media, local movies and TV shows, and other areas where dialects are preferred.
- All the data is entered by hand and is based on practical usages of the language. The database utilizes transcribed movies and TV shows, translated poems and songs in colloquial Arabic, and recorded phrases from conversations, all to show real usages of the language. It also draws from textbooks, other dictionaries, and academic resources to gather additional examples. Notes on usage, synonyms and antonyms, and comparisons between words help users understand when one word is more appropriate than another. New words and phrases are regularly added to the dictionary, creating a rich and practical data set.
- It has three search features: users can search by Arabic word, Arabic root, or English word. The Arabic root search allows users to contextualize words in relation to others derived from the same root. The multiple search features help users become producers of the language, giving them the ability to speak and write in Arabic, instead of just consumers who can only read and listen.
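For readers curious how a root-linked, multi-dialect lookup might be organized, here is a minimal Python sketch of the three search modes described above. It is an illustration only, not the project's actual code or database design: the `Entry` and `RootLinkedDictionary` names, the field layout, and the handful of sample entries are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary entry (hypothetical layout, not the project's schema)."""
    word: str     # Arabic headword
    root: str     # consonantal root shared by related words
    dialect: str  # e.g. "MSA", "Egyptian", "Levantine"
    gloss: str    # short English definition


class RootLinkedDictionary:
    """Indexes every entry three ways: by Arabic word, by root, by English word."""

    def __init__(self):
        self._by_word = defaultdict(list)
        self._by_root = defaultdict(list)
        self._by_english = defaultdict(list)

    def add(self, entry):
        self._by_word[entry.word].append(entry)
        self._by_root[entry.root].append(entry)
        # Index each lowercase word of the gloss so "see" matches "to see".
        for token in entry.gloss.lower().split():
            self._by_english[token].append(entry)

    def search_word(self, word):
        return list(self._by_word.get(word, []))

    def search_root(self, root):
        return list(self._by_root.get(root, []))

    def search_english(self, term):
        return list(self._by_english.get(term.lower(), []))


# A few illustrative entries, chosen for this sketch only.
demo = RootLinkedDictionary()
for sample in [
    Entry("كتاب", "كتب", "MSA", "book"),
    Entry("مكتبة", "كتب", "MSA", "library"),
    Entry("رأى", "رأي", "MSA", "to see"),
    Entry("شاف", "شوف", "Egyptian", "to see"),
    Entry("شاف", "شوف", "Levantine", "to see"),
]:
    demo.add(sample)

# English search gathers the equivalents of "see" across MSA and the dialects.
for hit in demo.search_english("see"):
    print(hit.dialect, hit.word, hit.root)
```

In this sketch, a root search for كتب returns both "book" and "library", while an English search for "see" returns the MSA verb alongside the Egyptian and Levantine one, which is the kind of cross-dialect comparison the dictionary is meant to support.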
Comparing this to other resources is illustrative. Most online resources for Arabic, such as Ejtaal’s Arabic Almanac, Aratools, and Almaany.com, are excellent resources for MSA, but only for MSA. Many online resources rely on pre-made data sets and algorithms, making it hard for users to discern the proper context for the definition found. Google Translate, for instance, only works with MSA and its translation algorithm draws predominantly from United Nations documents, making it hard for Google to understand things outside of the UN style; it can’t even touch dialects. A 2011 study scored Google’s Arabic-English translation at 34/100. Printed texts could never cover a multi-dialect dictionary; it would be like searching through two dictionaries at the same time, with the hope that they line up. The Living Arabic Project’s unique data structure ensures accessibility.
Why is Funding Needed?
The funding ask is $50,000 and is needed to complete the data entry for the Levantine and MSA dictionaries. I have been working on this project for eight years, during which I completed the Egyptian dialect dictionary and used it as a pilot project to work on the database structure and code. With funding, the MSA and Levantine dialect dictionaries (the latter covering the Palestinian, Lebanese, and Syrian dialects) can be completed in 12 months. Without funding, it will take me years on my own.
Furthermore, the funding will be used to pay Syrian refugees to work on the Levantine and MSA dictionaries. The Syrian refugee crisis is the largest the world has witnessed since World War II, and there are many skilled Syrians who are in need of work. In the long term, the Living Arabic Project will also help them by giving them an online dictionary to preserve their dialect for them and their children.
This is a passion project for me, and I will be working on it the rest of my life. The long-term goal is to complete dictionaries for the seven main dialect groups (including Levantine, Iraqi, Gulf, North African, Egyptian, and Yemeni) and one for MSA. But completing the Levantine and MSA dictionaries, along with the already completed Egyptian dictionary, will achieve two goals. First, it will cover the most commonly taught forms of Arabic (MSA, Egyptian, and Levantine). Second, it will give me enough data to start creating educational applications. Any funding beyond the ask amount will go toward completing the other dialect dictionaries and creating their respective applications.
Evidence suggests that this is already having an impact on Arabic education. The site is live and available to everyone. Hundreds of searches are conducted every day. Educators, professional translators, and students have complimented the site and extolled the virtues of a multi-dimensional Arabic dictionary. With the basic code now completed, the data structure mapped out, and the basic resources collected, the only thing holding the project back is the data entry.
This project won’t end after this Kickstarter. As the name suggests, it is a living project designed to evolve with the language. What has been completed thus far has been a laborious process that started in 2008 and has grown over the years. It’s a lifelong passion, but a worthy one because it can change how we perceive a language. Behind that language are millions of people who speak it every day. If we can change how we learn the language, we can improve how we connect with people.
Risks and challenges
The biggest risk is that I will fall behind schedule. I’ve estimated the data entry will take 12 months, based on my past rate of entering words and on conversations with the people who would be doing the data entry, in which I reviewed their schedules and tested their accuracy. Nonetheless, unexpected events could hamper the process. Even if the project is delayed, I can promise that it will still be completed. I’ve been working on it since 2008 and have no intention of stopping.
At this point, the basic code has been developed and tested, and proof of its functionality is available for anyone to test at www.livingarabic.com. So there is no risk that the product won't work.
I pay for the site, www.livingarabic.com, out of my own funds, and will continue to do so, without worry that the site will go down.