Localisation in Ethiopia

Daniel Yacob outlines the importance of history to Ethiopia and how it is instrumental to future localisation challenges there


Originally published in the December 2005 issue of Localisation Focus.

ONCE KNOWN AS THE HEART OF ABYSSINIA AND AS THE FABLED SOURCE OF THE NILE, Ethiopia today resembles an ancient society embracing all that information technology has to offer. Situated in the Horn of Africa, Ethiopia’s land mass is some 1,127,127 km², just a little larger than France, Germany and Italy combined! The country’s capital, Addis Ababa, sits at an elevation of some 2,400 m, amongst the highest cities in the world. Ethiopia’s topography divides into northern drier highlands and lusher southern lowlands, with the volcanically active Dallol Depression being the lowest point at -115 m. Dallol also carries the distinction of having the world’s highest average annual temperature of 35°C.

Spread across this tapestry, Ethiopia’s more than 70 million people present a mix of various ethnic groups living in both urban and rural communities. Since transitioning to a market-oriented economy in 1991, Ethiopia has seen steady economic growth, although the per capita income remains at an alarming US$110. The telecom sector continues to grow, with internet service introduced in January 1997 and mobile phone service following in early 1999, now with 10,000 and 600,000 subscribers respectively. While present landline teledensity is low at around 1%, demand is very high and the Ethiopian Telecom Authority has contracted with both Swedish-based Ericsson and Finland’s Nokia to expand capacity to 650,000 subscribers.

No longer a land of kingdoms and emperors, Ethiopia’s government, while forward thinking and democratic, has been slow to privatise the telecom sector. Privatisation is expected ‘very soon’ and will surely lead to an explosion in telecom use. In the meantime, small private telecom businesses have already managed to flourish in markets such as cyber cafes, mobile phone sales, and even custom-made personal computers.

Language Policy in Ethiopia
Ethiopia has a rich diversity of languages and language families. There are at least 80 local languages in Ethiopia, grouped under the Ethio-Semitic, Omotic, Cushitic and Nilo-Saharan families. Amharic is the official national language, though the federal government typically provides translation into the two other major languages, Afaan Oromo and Tigrinya.

Historically the minority languages have not fared well. The previous communist-ruled regime would only permit publication in languages for which it employed censors, i.e. the three major languages. The new government is more sensitive to the language issue, and any of the nine semi-autonomous administrative regions may adopt its own local language as a regional language. The federal government will then support the regional languages in various capacities, such as for education materials. While the government may have good intent for linguistic pluralism, its capacity to support it is very limited. With the fear of imprisonment for publishing in one’s own language lifted, local language utilisation activity is gradually picking up pace.

13 Months of Sunshine?
For the outsider, the national tourism motto of ‘13 Months of Sunshine’ is an unexpected introduction into the world of Ethiopian localisation — there is a little more to it than language translation alone. History is very, very important to Ethiopia, which traces the foundation of its modern society back some 3,000 years to the time of King Solomon and the Queen of Sheba. To assure the success of any effort that would advance some part of Ethiopian society a step forward into the future, the first step must always be to look back into the past. The three millennia of accumulated traditions, and the axioms of culture that they revolve around, must be accommodated.

Many of Ethiopia’s traditions are inherited from the Ge’ez dynasties. This is particularly true of writing practices. ‘Ge’ez’ was the name of the people, who also spoke the language ‘Ge’ez’ and wrote in the two-millennia-old Ge’ez alphabet. Better known outside of Ethiopia as ‘Ethiopic’, the alphabet is technically an ‘open syllabary’, each letter representing both a consonant and vowel together. Ethiopic script also has 10 punctuation symbols, including a printed space symbol, ፡, and twenty number symbols. The writing system introduces a few challenges for typesetting software.
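One practical consequence of the syllabary can be seen in how Unicode arranges the letters. The sketch below assumes the layout of the main Ethiopic block (U+1200 to U+135A), in which each consonant series occupies eight consecutive code points: seven vowel orders plus an eighth slot that holds an additional form or is left unassigned. The function names are illustrative and do not belong to any library, and the regular layout is a property of the encoding rather than of the script itself.

```python
import unicodedata

ETHIOPIC_BASE = 0x1200      # first code point of the main Ethiopic block
ETHIOPIC_LAST = 0x135A      # last syllable of the main block (assumed range)
SLOTS_PER_SERIES = 8        # code points reserved per consonant series


def split_syllable(ch: str) -> tuple[int, int]:
    """Return (series_index, order) for one Ethiopic syllable.

    series_index counts consonant series from 0; order runs from 1 to 8,
    where 1 to 7 are the traditional vowel orders.
    """
    cp = ord(ch)
    name = unicodedata.name(ch, "")
    if not (ETHIOPIC_BASE <= cp <= ETHIOPIC_LAST) or not name.startswith(
        "ETHIOPIC SYLLABLE"
    ):
        raise ValueError(f"not an Ethiopic syllable in the main block: {ch!r}")
    offset = cp - ETHIOPIC_BASE
    return offset // SLOTS_PER_SERIES, offset % SLOTS_PER_SERIES + 1


def same_series(a: str, b: str) -> bool:
    """True when two syllables share a consonant series (differ only in order)."""
    return split_syllable(a)[0] == split_syllable(b)[0]
```

A sorting or search routine could use a split like this to treat all forms of one consonant as related, although real collation for a given language would follow its locale rules rather than raw code point order.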

Those languages that do use Latin script in Ethiopia will do so in their own way. That is, some digraphs and collation pairs are introduced, some phonetic values of the consonant letters will change, and new conventions are applied for the use of Latin vowels. While the writing convention with Latin script is systematic and consistent within a language, the convention may be incompatible with other Ethiopian languages that have also adopted the Latin script.

The number symbols of Ethiopic script comprise 9 digits, 9 ‘tens’ values, a ‘hundred’ symbol and a ‘ten-thousand’ symbol. The numeral system itself is not a positional decimal system like our familiar 0-9. There are some superficial similarities to the Roman numeral system, but on further inspection the Ethiopic numeral system is nowhere near as complex. A conversion service between integers and Ethiopic numerals will be required to properly format dates and to label page numbers and book chapters.
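To make the idea of a conversion service concrete, here is a minimal sketch of one commonly described procedure: split the decimal digits into pairs, write each pair with a ‘tens’ sign and a digit sign, and follow alternate pairs with the ‘hundred’ (፻) or ‘ten-thousand’ (፼) symbol. The function name `to_ethiopic` is hypothetical, and the rule for dropping a ‘one’ before ፻ and before a leading ፼ is a convention assumed here; practice may vary.

```python
# Code points from the Unicode Ethiopic block.
ONES = [chr(0x1369 + i) for i in range(9)]   # signs for 1..9
TENS = [chr(0x1372 + i) for i in range(9)]   # signs for 10..90
HUNDRED = "\u137b"                           # sign for 100
TEN_THOUSAND = "\u137c"                      # sign for 10,000


def to_ethiopic(n: int) -> str:
    """Format a positive integer with Ethiopic number symbols (sketch)."""
    if n < 1:
        raise ValueError("there is no Ethiopic sign for zero or negatives")

    digits = str(n)
    if len(digits) % 2:              # pad to an even number of digits
        digits = "0" + digits
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]

    out = []
    for index, pair in enumerate(pairs):
        pos = len(pairs) - 1 - index         # 0 is the least significant pair
        tens, ones = int(pair[0]), int(pair[1])
        value = 10 * tens + ones
        text = (TENS[tens - 1] if tens else "") + (ONES[ones - 1] if ones else "")

        if pos % 2 == 1:
            # Odd positions count hundreds; an empty pair contributes nothing.
            sep = HUNDRED if value else ""
        else:
            # Even positions above zero mark a step of ten thousand and are
            # kept even for an empty pair, since they scale what came before.
            sep = TEN_THOUSAND if pos else ""

        # Assumed convention: drop 'one' before 100, and before a leading 10,000.
        if value == 1 and sep and (sep == HUNDRED or index == 0):
            text = ""

        out.append(text + sep)
    return "".join(out)
```

Under these assumptions, `to_ethiopic(1998)` gives ፲፱፻፺፰, that is, 19 followed by the hundred sign and then 98.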

And what about those 13 months? That’s right; Ethiopia follows a form of the Coptic calendar. The Ethiopian calendar has 12 months of 30 days and a 13th month of 5 days in normal years, or 6 days in a leap year, which still occurs every 4 years but falls in the year before the Gregorian leap year. The present year in Ethiopia is 1998 and it began on 11 September 2005. As for the sunshine… in a place as big as Ethiopia, there will always be some corner of the country receiving at least a little sun on any given day.
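The calendar rules above are regular enough to sketch in a few lines. The example below assumes that the Ethiopian new year falls on 11 September, or on 12 September when the Ethiopian year just ended was a leap year, a rule that holds only while the Gregorian calendar keeps its own four-year rhythm (roughly Gregorian 1901 to 2099). The function names and the supported year range are choices made for this illustration.

```python
from datetime import date, timedelta


def is_ethiopian_leap_year(year: int) -> bool:
    """Leap years fall one year before the Gregorian leap year."""
    return year % 4 == 3


def days_in_month(year: int, month: int) -> int:
    """Months 1 to 12 have 30 days; month 13 has 5, or 6 in a leap year."""
    if not 1 <= month <= 13:
        raise ValueError("month must be in 1..13")
    if month <= 12:
        return 30
    return 6 if is_ethiopian_leap_year(year) else 5


def ethiopian_new_year(year: int) -> date:
    """Gregorian date of day 1, month 1 of the given Ethiopian year."""
    # Assumed validity window: outside it the 11/12 September rule drifts.
    if not 1894 <= year <= 2092:
        raise ValueError("sketch only covers Ethiopian years 1894..2092")
    # After a 6-day 13th month the new year starts one day later.
    day = 12 if year % 4 == 0 else 11
    return date(year + 7, 9, day)


def ethiopian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert an Ethiopian calendar date to a Gregorian date."""
    if not 1 <= day <= days_in_month(year, month):
        raise ValueError("day out of range for that month")
    offset = (month - 1) * 30 + (day - 1)
    return ethiopian_new_year(year) + timedelta(days=offset)
```

With these assumptions, `ethiopian_to_gregorian(1998, 1, 1)` returns 11 September 2005, matching the new year mentioned above. Production software would more likely rely on a maintained calendar library than on a hand-written rule like this one.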

Localisation Activity
Localisation in Ethiopia can be traced back to the early 1980s, when hobbyists began to tinker with adding Ethiopic letters to their DOS and Macintosh computers. The first serious undertaking was at the National Computer and Information Center (NCIC) in Ethiopia, where the DOS operating system and some applications were translated into Amharic. ‘Agafari’ (literally ‘server’), as it was known, did not spread far outside the NCIC, most probably because there were not yet large numbers of computers to be found in the private sector. By the time computers were becoming more commonplace, the Windows operating system had arrived and the earlier success with DOS was behind the times. The NCIC did make a family of Agafari TrueType fonts and a keyboard but did not go on to localise applications. The Agafari DOS fonts found new life in 1997 when they were contributed to Linux as console fonts. If you have a Linux system, you have this gift from the NCIC waiting for you on your computer.

More advanced localisation would have to wait for Ethiopic support in the Unicode standard, which finally arrived in the year 2000 with the introduction of the 3.0 standard. In the five years since, locales have been developed for the Linux operating system for every Ethiopian language that has an ISO 639 code. The locale data has also become a part of the Unicode Consortium’s Common Locale Data Repository (CLDR). CLDR data can be used with IBM’s International Components for Unicode (ICU), which has recently begun to support the Ethiopic calendar. The Google service and the GNOME desktop environment have also been translated into Amharic. Paradoxically, while you can use Google with an Amharic interface, Google is not yet capable of searching for Amharic words!

This and other open source localisation activity was conducted by the Ge’ez Frontier Foundation and was exploratory in nature, intended to root out and resolve problems still impeding software support for Ethiopian languages. More formal efforts have recently got under way at Ethiopian universities. The Computer Science department at the nation’s top technical institute, Addis Ababa University (AAU), is leading the way forward. This department has localised the OpenCMS content management system, prepared for the regional administrative offices to manage web content on the ‘WoredaNet’. Handwriting recognition algorithms, which can identify Ethiopic letters with 99.4% accuracy, have been developed with an eye towards PDA support. Another project explored the use of WAP for Ethiopian languages, with a demonstration delivering agricultural market information in Amharic. Perhaps the most highly anticipated project about to reach the public sector is the localisation of mobile phones to support Ethiopic SMS. A particular challenge here was to devise a keyboard system that would allow the 231 most common Ethiopic letters to be typed with the 12 keys on a phone keypad. A predictive text algorithm (T9-like) was also developed to facilitate entry.

At present, AAU is working with Microsoft on application terminology translation. Similar to the Google approach, the project enlists public support via an online translation interface. Also notable, the Highlight Learning Center of Excellence (HiLCoE) in Addis Ababa has taken on localisation of the OpenOffice suite this year. In the ‘Cradle of Civilisation’ localisation is off to a promising, if limited, start. Given the limited resources that can be put into localisation for Ethiopia, the biggest challenge of all will be for public volunteers, academia, government and industry to work together to make ‘L x N’ a sustainable and ultimately successful effort for its beneficiaries.

Daniel Yacob began his journey into Ethiopic localisation and natural language processing in 1992. Daniel is now the director of The Ge’ez Frontier Foundation, a charitable organisation dedicated to the advancement of Ethiopic and Ethiopian languages in electronic environs. He can be reached at yacob@geezNOSPAM.org (remove NOSPAM to email).

 

 

Copyright Localisation Research Centre . All rights reserved.