Localisation in Ethiopia
Daniel Yacob outlines the importance of history
to Ethiopia and how it is instrumental to
future localisation challenges there
Originally published in the December 2005 issue of Localisation Focus.
ONCE KNOWN AS THE HEART OF ABYSSINIA
AND AS THE FABLED SOURCE OF THE NILE,
Ethiopia today resembles an ancient society embracing
all that information technology has to offer. Situated in the
Horn of Africa, Ethiopia’s land mass is some 1,127,127 km²,
just a little larger than France, Germany and Italy combined!
The country’s capital, Addis Ababa, sits at an elevation of
some 2,400m, placing it amongst the highest cities in the world. Ethiopia’s
topography divides into drier northern highlands and lusher
southern lowlands, with the volcanically active Dallol
Depression being the lowest point at -115m. Dallol also carries
the distinction of having the world’s highest average annual temperature,
35°C.
Spread across this tapestry, Ethiopia’s more than 70 million
people present a mix of various ethnic groups living in both
urban and rural communities. Since transitioning to a market-oriented
economy in 1991, Ethiopia has seen steady economic
growth, although per capita income remains alarmingly low at
US$110. The telecom sector continues to grow with internet
service introduced in January of 1997 and mobile phone service
following in early 1999, now with 10,000 and 600,000 subscribers
respectively. While present landline teledensity is low at
around 1%, demand is very high and the Ethiopian Telecom
Authority has contracted with both Sweden’s Ericsson and
Finland’s Nokia to expand capacity to 650,000 subscribers.
No longer a land of kingdoms and emperors, Ethiopia’s government,
while forward thinking and democratic, has been
slow to privatise the telecom sector. Privatisation is expected ‘very
soon’ and will surely lead to an explosion in telecom use. In the
meantime, small private businesses have already managed to flourish in
markets such as cyber cafes, mobile phone sales, and even
custom-made personal computers.
Language Policy in Ethiopia
Ethiopia has a rich diversity of languages and language families.
There are at least 80 local languages in Ethiopia grouped under
the Ethio-Semitic, Omotic, Cushitic and Nilo-Saharan families.
Amharic is the official national language, though the federal government
typically provides translation into the two other major
languages, Afaan Oromo and Tigrinya.
Historically the minority languages have not fared well. The
previous communist-ruled regime would permit publication
only in languages for which it employed censors, i.e. the three
major languages. The new government is more sensitive to the
language issue, and any of the nine semi-autonomous administrative
regions may adopt its own local language as a regional language.
The federal government will then support the regional
languages in various capacities, such as for education materials.
While the government may have good intent for linguistic pluralism,
its capacity to support it is very limited. With the fear of
imprisonment for publishing in one’s own language lifted, local
language utilisation activity is gradually picking up pace.
13 Months of Sunshine?
For the outsider, the national tourism motto of ‘13 Months of
Sunshine’ is an unexpected introduction to the world of
Ethiopian localisation: there is a little more to it than language
translation alone. History is very, very important to Ethiopia,
which traces the foundation of its modern society back some
3,000 years to the time of King Solomon and the Queen of
Sheba. To assure the success of any effort that would advance
some part of Ethiopian society a step forward into the future,
the first step must always be to look back into the past. The
three millennia of accumulated traditions, and the axioms of
culture that they revolve around, must be accommodated.
Many of Ethiopia’s traditions are inherited from the Ge’ez
dynasties. This is particularly true of writing practices. ‘Ge’ez’
was the name of the people, who also spoke the language ‘Ge’ez’
and wrote in the two-millennia-old Ge’ez alphabet. Better
known outside of Ethiopia as ‘Ethiopic’, the alphabet is technically
an ‘open syllabary’, each letter representing both a consonant
and vowel together. Ethiopic script also has 10 punctuation
symbols, including a printed space symbol (፡), and twenty number
symbols. The writing system introduces a few challenges for
typesetting software.
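To make the ‘open syllabary’ idea concrete, here is a minimal Python sketch. It assumes the layout of the Unicode Ethiopic block, in which each consonant series occupies eight consecutive code points ordered by vowel form. The function name `syllable` and the three-entry consonant table are illustrative only, covering just enough letters to spell one word.

```python
import unicodedata

# First code point of a few consonant series in the Unicode Ethiopic block.
# Illustrative subset only; the full block contains many more series.
SERIES_BASE = {"l": 0x1208, "m": 0x1218, "s": 0x1230}


def syllable(consonant: str, order: int) -> str:
    """Return the letter for a consonant in one of its seven vowel orders (0-6)."""
    if not 0 <= order <= 6:
        raise ValueError("order must be between 0 and 6")
    return chr(SERIES_BASE[consonant] + order)


# 'selam' (peace, hello) is written with three letters, not five.
word = syllable("s", 0) + syllable("l", 3) + syllable("m", 5)

# Print code points and Unicode names (ASCII-only output).
for letter in word:
    print(f"U+{ord(letter):04X}", unicodedata.name(letter))
```

Because consonant and vowel are fused into one letter, a font or input method has to deal with hundreds of distinct shapes rather than a few dozen, which is part of the typesetting challenge mentioned above.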
Those languages that do use Latin script in Ethiopia do so
in their own way. That is, some digraphs and collation pairs are
introduced, some phonetic values of the consonant letters
change, and new conventions are applied for the use of Latin
vowels. While the writing convention with Latin script is systematic
and consistent within a language, the convention may be
incompatible with other Ethiopian languages that have also
adopted the Latin script.
The number symbols of Ethiopic script comprise 9
digits, 9 ‘tens’ values, a ‘hundred’ symbol and a ‘ten-thousand’
symbol. The numeral system itself is not a positional decimal system
like our familiar 0-9. There are some superficial similarities
to the Roman numeral system, but on further inspection the
Ethiopic numeral system is nowhere near as complex. A conversion
service between integers and Ethiopic numerals will be required
to properly format dates and to label page numbers and book chapters.
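Such a conversion service can be quite small. The Python sketch below is one way to approach it, assuming the Unicode code points for the Ethiopic numerals (U+1369 to U+137C) and the commonly described technique of working through the number in pairs of decimal digits. The name `to_ethiopic` is illustrative, and this is a sketch rather than a reference implementation.

```python
# Build the symbol tables from Unicode code points (assumed layout:
# U+1369..U+1371 are 1-9, U+1372..U+137A are 10-90).
ONES = [""] + [chr(0x1369 + i) for i in range(9)]
TENS = [""] + [chr(0x1372 + i) for i in range(9)]
HUNDRED = chr(0x137B)
TEN_THOUSAND = chr(0x137C)


def to_ethiopic(n: int) -> str:
    """Format a positive integer with Ethiopic numerals."""
    if n < 1:
        raise ValueError("Ethiopic numerals have no zero or negative forms")
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad so the digits split evenly into pairs
    groups = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    last = len(groups) - 1
    parts = []
    for index, value in enumerate(groups):
        pos = last - index  # 0 is the least significant pair
        text = TENS[value // 10] + ONES[value % 10]
        if pos % 2 == 1:
            # Odd pairs count hundreds; an empty pair writes nothing at all.
            sep = HUNDRED if value else ""
        elif pos > 0:
            # Even pairs above the units count ten-thousands.
            sep = TEN_THOUSAND
        else:
            sep = ""
        # A lone 'one' is not written before the hundred sign,
        # nor before a leading ten-thousand sign.
        if value == 1 and (pos % 2 == 1 or (index == 0 and pos > 0)):
            text = ""
        parts.append(text + sep)
    return "".join(parts)
```

With this sketch, `to_ethiopic(1998)` yields ፲፱፻፺፰ (ten, nine, hundred, ninety, eight), while `to_ethiopic(100)` yields the bare ፻.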
And what about those 13 months? That’s right; Ethiopia follows
a form of the Coptic calendar. The Ethiopian calendar has
12 months of 30 days and a 13th month of 5 days in normal
years, or 6 days in a leap year. A leap year still occurs every 4
years, but in the year prior to when it occurs in the Gregorian
calendar. The present year in Ethiopia is 1998 and it began on
11 September. As for the sunshine… in a place as big as
Ethiopia, there will always be some corner of the country receiving
at least a little sun on any given day.
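For dates in the present era the calendar conversion is simple arithmetic. The Python sketch below assumes that the Ethiopian New Year falls on 11 September, or on 12 September when the following Gregorian year is a leap year. That shortcut only holds for Gregorian years 1900 to 2099, so the code refuses anything outside that range. The function names are illustrative.

```python
from datetime import date
from typing import Tuple


def ethiopian_new_year(gregorian_year: int) -> date:
    """Gregorian date of the Ethiopian New Year that begins in the given year."""
    if not 1900 <= gregorian_year <= 2099:
        raise ValueError("this shortcut only holds for 1900-2099")
    # The extra day of an Ethiopian leap year pushes the next New Year
    # to 12 September, just before a Gregorian leap year.
    day = 12 if (gregorian_year + 1) % 4 == 0 else 11
    return date(gregorian_year, 9, day)


def to_ethiopian(d: date) -> Tuple[int, int, int]:
    """Convert a Gregorian date to (year, month, day) in the Ethiopian calendar."""
    if not 1901 <= d.year <= 2099:
        raise ValueError("this sketch only handles 1901-2099")
    start = ethiopian_new_year(d.year)
    if d < start:
        start = ethiopian_new_year(d.year - 1)
    days = (d - start).days
    # Twelve 30-day months, then the short 13th month.
    return start.year - 7, days // 30 + 1, days % 30 + 1


print(to_ethiopian(date(2005, 9, 11)))  # (1998, 1, 1)
```

The day before, 10 September 2005, comes out as the 5th and last day of the 13th month of 1997.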
Localisation Activity
Localisation in Ethiopia can be traced
back to the early 1980s when hobbyists
began to tinker with adding
Ethiopic letters to their DOS and
Macintosh computers. The first serious
undertaking was at the National
Computer and Information Center
(NCIC) in Ethiopia where the DOS
operating system and some applications
were translated into Amharic.
‘Agafari’ (literally ‘server’), as it was
known, did not spread too far outside
of the NCIC, most probably because
there were not yet large numbers of
computers to be found in the private
sector. By the time computers were
becoming more commonplace the
Windows operating system had
arrived and the earlier success with
DOS was now behind the times. The
NCIC did make a family of Agafari TrueType fonts and a keyboard
but did not go further to attempt localising applications.
The Agafari DOS fonts found new life in 1997 when they were
contributed to Linux as console fonts. If you have a Linux system,
you have this gift from the NCIC waiting for you on your
computer.
More advanced localisation would have to wait for Ethiopic
support in the Unicode standard, which finally arrived in the
year 2000 with the introduction of version 3.0. In the five
years since, locales have been developed for the Linux operating
system for every Ethiopian language covered by the ISO 639 standard.
The locale data has also become a part of the Unicode
Consortium’s Common Locale Data Repository (CLDR).
CLDR data can be used with IBM’s International
Components for Unicode (ICU), which has recently begun to
support the Ethiopic calendar. The Google service and the
GNOME desktop environment have also been translated into
Amharic. Paradoxically, while you can use Google with
an Amharic interface, Google is not yet capable of searching for
Amharic words!
This and other open-source localisation activity was
conducted by the Ge’ez Frontier Foundation and was exploratory
in nature, intended to root out and resolve problems still impeding
software support for Ethiopian languages. More formal
efforts have recently got under way at Ethiopian universities. The
Computer Science department at the nation’s top technical institute,
Addis Ababa University (AAU), is leading the way forward.
This department has localised the OpenCMS content management
system prepared for the regional administrative offices to
manage web content on the ‘WoredaNet’. Handwriting recognition
algorithms, which can identify Ethiopic letters with 99.4%
accuracy, have been developed with an eye towards PDA support.
Use of WAP for Ethiopian languages was another
project, demonstrated by delivering agricultural
market information in Amharic. Perhaps the most highly anticipated
project about to reach the public sector is the localisation
of mobile phones to support Ethiopic SMS. A particular challenge
here was to devise a keyboard system that would allow the
231 most common Ethiopic letters to be typed with the 12 keys
on a phone keypad. A predictive text algorithm (T9-like) was
also developed to facilitate entry.
AAU is currently working with Microsoft on application
terminology translation. Similar to the Google approach,
the project enlists public support via an online translation interface.
Also notable is the Highlight
Learning Center of Excellence (HiLCoE) in Addis Ababa, which has
taken on localisation of the OpenOffice suite this year.
In the ‘Cradle of Civilisation’ localisation is off to a promising,
if limited, start. Given the limited resources that can be put
into localisation for Ethiopia, the biggest challenge of all will be
for public volunteers, academia, government and industry to
work together to make ‘L x N’ a sustainable, and ultimately successful
effort for its beneficiaries.
Daniel Yacob began his journey into Ethiopic localisation and
natural language processing in 1992. Daniel is now the director
of The Ge’ez Frontier Foundation, a charitable organisation
dedicated to the advancement of Ethiopic and Ethiopian languages
in electronic environs. He can be reached at
yacob@geezNOSPAM.org (remove NOSPAM to email).