LRC XIII
Localisation4All

The 13th Annual Internationalisation and Localisation Conference 
organised by the Localisation Research Centre (LRC)
with the Centre for Next Generation Localisation (CNGL)

02-03 October 2008
Marino Institute of Education, 
Dublin, Ireland

Conference Talks




Using community feedback to improve social networking terminology in Microsoft products

Sara Nicolini & Palle Petersen - Microsoft

Unlike more traditional software such as operating systems and office productivity suites, social networking technology, and therefore its terminology, has developed rapidly over the last few years.

With more Web 2.0 applications featuring a strong social aspect, including the voice of the customer is becoming increasingly important for capturing the latest terminology used in the social networking domain. To be successful in tomorrow’s marketplace, established businesses need to create business models that include their customers while leveraging the global expertise, know-how and future potential those customers bring to the table.

To address this challenge, Microsoft has launched several initiatives to embrace end-users and the “community”. One of them is the “MTCF” terminology community engagement and feedback program designed to assess and improve the quality of localised Microsoft Messenger and Spaces terminology through community feedback with a focus on social networking terminology. 
This presentation will cover lessons learned from the 1900+ terminology suggestions received across 29 EMEA languages during this feedback program. It will explore interesting observations from the community around existing terminology, implications for source terminology, the importance of style and “artistic license” in translation and challenges to existing - and often anecdotal - assumptions about terminology quality.

Sarawak Language Technology (SaLT) Initiative: The preservation of Sarawak Ethnic Languages

Alvin Yeo - Universiti Malaysia Sarawak

According to Ethnologue, 84% of the 46 living languages in Sarawak have fewer than 10,000 speakers. The number of speakers of these languages is also decreasing due to exogamy and the influence of major languages such as Bahasa Malaysia and English, especially in urban areas. In addition, to the authors’ knowledge, there are no applications that involve Sarawak languages. For example, there are no Iban electronic dictionaries, nor any computer-supported translation tools that could help speed up the translation of materials into the target Sarawak ethnic languages, or vice versa.

Thus, the Sarawak Language Technologies (SaLT) Research Group at Universiti Malaysia Sarawak has initiated a number of projects with the end goal of revitalising and maintaining the ethnic languages of Sarawak. The ongoing projects include building corpora of languages (Iban, Melanau and Kelabit), as well as research and development of technologies that contribute to the implementation of software for the ethnic languages. Specifically, these projects include the development of morphological analysers and POS taggers, which contribute to work on Iban-English translation, and work on human-computer interaction using Melanau speech and text. Other projects in the pipeline include a wiki approach to building a Bidayuh lexicon, and a web-based Sarawak Malay language dictionary. These projects would not have been possible without the collaboration of partners such as Tun Jugah Foundation, Dewan Bahasa dan Pustaka (Sarawak Branch), and Pustaka Negeri Sarawak, and national funding from the Malaysian Ministry of Science, Technology and Innovation. The final paper will provide more detailed information on the above projects.

Process Automation at LSPs – It Ain’t Just About the Tools

Dr. David Filip - Moravia

As overseer of internal change projects at Moravia, David Filip knows that while Localisation Service Providers (LSPs) work with projects every day for our clients, we perhaps don’t really understand what it takes to specify, execute and monitor the success (or failure) of a change in how things are done, especially when that change is labelled internally as a “project”. LSPs need to apply professional project management approaches inside our own houses. We can automate things, but it’s not all about the technology.

Systemic localisation validation across languages 

Martin Ørsted - Microsoft 

Microsoft has designed an internal scripting tool that ensures known localisation issues are systematically caught and fixed across all languages. The aim is that an issue only ever has to be found once, in whatever language; localisation verification comments attached to that resource then ensure the issue is caught in every language where it may occur. Introducing this as a practice can dramatically cut down on certain forms of testing and make the use of orthogonal arrays across languages and pseudo-localisation much more efficient.
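The idea of finding an issue once and then checking for it in every language can be illustrated with a minimal sketch. This is not the Microsoft tool described in the talk; the rule (every placeholder in the source string must survive in each translation), the function names and the sample strings are all assumptions made for illustration.

```python
import re

# Hypothetical rule: every placeholder such as {0} or %s in the source
# string must also appear in each translation.
PLACEHOLDER = re.compile(r"\{\d+\}|%[sd]")

def missing_placeholders(source, translation):
    """Return the placeholders in `source` that are absent from `translation`."""
    wanted = PLACEHOLDER.findall(source)
    return [p for p in wanted if p not in translation]

def check_all_languages(source, translations):
    """Apply the same rule to every language; return {language: missing}."""
    report = {}
    for lang, text in translations.items():
        missing = missing_placeholders(source, text)
        if missing:
            report[lang] = missing
    return report

# Invented sample data: the French string has lost its placeholder.
translations = {
    "de": "Datei {0} wurde gelöscht",
    "fr": "Le fichier a été supprimé",
    "da": "Filen {0} blev slettet",
}
report = check_all_languages("File {0} was deleted", translations)
print(report)
```

Once a rule like this exists for a resource, running it over every language costs nothing extra, which is the efficiency the abstract points to.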

Open standards in use in localisation - an engineering approach

Andres Vega - Tek Translations

This session will provide an overview of existing open standards within content handling and localisation from a technical and engineering perspective. For some of these standards, attendees will learn about their advantages, both general and specific to localisation, as well as their implementation challenges and issues. Practical, hands-on examples will illustrate how these standards can be beneficial in a real production scenario.

The areas covered will include:

Unicode: its impact in solving cross-script and cross-platform issues, and conversion issues to and from existing legacy encodings. Attendees will learn about Unicode’s current implementation status in localisation, DTP and multimedia tools. Other topics that will be briefly considered include OpenType font capabilities and Gaiji characters.
XML: its impact in localisation as an open and extensible technology for content handling and data interchange. Examples of localisation and segmentation related issues will be provided.
Content Management Systems: their impact on authoring, updates, content validation and DTP, and their localisation challenges. Practical examples will help illustrate these.

Localisation-specific standards will also be discussed, with special focus on XLIFF, TMX, and TBX, and a brief look into SRX.
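The conversion issues between Unicode and legacy encodings mentioned under the Unicode topic can be shown with a small sketch. It is an illustration only, not material from the session; the sample strings and the choice of Windows-1252 as the legacy encoding are assumptions.

```python
# A Western European legacy code page can represent "café" in one byte
# per character, while UTF-8 needs two bytes for the accented letter.
text = "café"
legacy_bytes = text.encode("cp1252")
utf8_bytes = text.encode("utf-8")

# Converting legacy bytes back to Unicode is lossless when the right
# code page is known.
round_trip = legacy_bytes.decode("cp1252")

# Going the other way can fail: Japanese text has no representation
# in cp1252, so the conversion raises an error.
try:
    "日本語".encode("cp1252")
    conversion_failed = False
except UnicodeEncodeError:
    conversion_failed = True

print(legacy_bytes, utf8_bytes, round_trip, conversion_failed)
```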

Checking Terminology Consistency with Statistical Methods

Alfredo Maldonado Guerra & Masaki Itagaki - Microsoft

Work in Statistical Language Technologies has uncovered numerous techniques that can be used in Software Localisation. This paper explores the application of statistical methods in the automatic validation of terminology consistency in localised software. 

It sets out a statistical algorithm that identifies the translation of a given source term in a software localisation project, and then determines whether the translation has been used consistently within that project. However, the accuracy of this algorithm depends on the size of the linguistic data made available to it (the bigger, the better), and since the typical software project is small by traditional statistical NLP standards, we need to find a way to compensate for this lack of data. 

At the same time, the algorithm needs to deal with the different grammatical features of each target language we work with at Microsoft. To address these issues, the authors chose a hybrid approach that combines statistical analysis with a minimal grammatical model. We discuss the statistical analysis applied and show the minimum amount of linguistic knowledge the model needs in order to deal with these issues successfully.
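The two steps described above (identify the likely translation of a source term, then check that it is used consistently) can be sketched as follows. This is not the authors' algorithm: it substitutes a plain Dice coefficient over co-occurrence counts as the association measure, uses no grammatical model at all, and runs on invented English-German segment pairs.

```python
from collections import Counter

def likely_translation(term, pairs):
    """Pick the target word most associated with `term` (Dice coefficient)."""
    term_count = 0            # segments whose source contains the term
    target_total = Counter()  # segments containing each target word
    target_with_term = Counter()
    for source, target in pairs:
        words = set(target.lower().split())
        target_total.update(words)
        if term in source.lower().split():
            term_count += 1
            target_with_term.update(words)
    scores = {
        w: 2 * target_with_term[w] / (term_count + target_total[w])
        for w in target_with_term
    }
    return max(scores, key=scores.get)

def inconsistent_segments(term, pairs):
    """Return segments that contain the term but not its likely translation."""
    best = likely_translation(term, pairs)
    return [
        (s, t) for s, t in pairs
        if term in s.lower().split() and best not in t.lower().split()
    ]

# Invented sample data: "folder" is translated two different ways.
pairs = [
    ("Open the folder", "Öffnen Sie den Ordner"),
    ("Delete the folder", "Löschen Sie den Ordner"),
    ("Rename the folder", "Verzeichnis umbenennen"),
    ("Open the browser", "Öffnen Sie den Browser"),
]
best = likely_translation("folder", pairs)
flagged = inconsistent_segments("folder", pairs)
print(best, flagged)
```

With so few segments the counts are fragile, which mirrors the data-size problem the abstract raises. Exact word matching would also miss inflected forms, which is presumably where a minimal grammatical model comes in.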

Linguistic Comparison and Analysis of Statistical Post-Editing between Chinese and Japanese

Midori Tatsumi & Yanli Sun - Symantec/Dublin City University

Statistical post-editing (SPE) has been attracting increasing attention. While this technique can greatly improve the quality of some machine translation (MT) output, it sometimes makes unwanted and inappropriate changes. In addition, some problems in MT output cannot easily be addressed by SPE. This paper analyses the results of a recent SPE experiment conducted by Systran on English to Chinese and Japanese translation using data provided by Symantec. The improvements and degradations made by SPE will be compared both quantitatively and qualitatively, focusing on the similarities and differences in the advantages and disadvantages of SPE between the two languages. The paper also investigates whether SPE will reduce the human post-editing that may be carried out after the SPE process on text intended for publication. The first part of this paper introduces the theoretical and research background on combining Rule-Based Machine Translation (RBMT) and SPE. The methodology of the current research is outlined in the second part. Detailed comparison, analysis and discussion are then presented in the third part. Finally, a short conclusion summarises the research and points out future work.

Automation of Terminology tasks using T-Manager

Rafael Guzman - Symantec

Good-quality, controlled terminology is critical for the success of rule-based Machine Translation (MT). Managing it typically involves tasks such as leveraging, checking for duplicates and deprecated terms, customisation, glossary alignment, and generating metrics reports. These tasks depend heavily on terminology comparisons, which require specific criteria. Unfortunately, these criteria are often underestimated and even neglected. When this happens, unexpected results and hidden issues occur. Doing these tasks manually is tedious, time-consuming, expensive and prone to errors. This presentation will provide an overview of the necessary terminology comparison criteria as well as a demo that will show how many of these tasks can be automated with the T-Manager terminology tool.
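Why comparison criteria matter can be seen in a small duplicate check. The sketch below is an assumption-laden illustration and does not reflect how T-Manager works: the criteria chosen (Unicode normalisation, whitespace collapsing, optional case folding), the function names and the sample glossary are all hypothetical.

```python
import unicodedata
from collections import defaultdict

def normalise(term, ignore_case=True):
    """Apply the comparison criteria before two terms are compared."""
    text = unicodedata.normalize("NFC", term)  # unify composed/decomposed forms
    text = " ".join(text.split())              # trim and collapse whitespace
    return text.casefold() if ignore_case else text

def find_duplicates(entries, ignore_case=True):
    """Group entries that are equal under the chosen criteria."""
    groups = defaultdict(list)
    for entry in entries:
        groups[normalise(entry, ignore_case)].append(entry)
    return [group for group in groups.values() if len(group) > 1]

# Invented glossary with entries that differ only in case and spacing.
glossary = ["Firewall", "firewall ", "Virus  Definition",
            "virus definition", "Quarantine"]
strict = find_duplicates(glossary, ignore_case=False)
loose = find_duplicates(glossary, ignore_case=True)
print(strict)
print(loose)
```

The same glossary yields no duplicates under the case-sensitive criterion and two duplicate groups under the case-insensitive one, which is the kind of hidden difference the abstract warns about.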

 

The Localisation Research Centre (LRC) provides a comprehensive information service to the localisation industry; conducts research and development in localisation and related areas, such as language engineering; produces a range of publications including Localisation Focus - the International Journal of Localisation; and oversees a number of education and training programs.

 
© Copyright 2008 Localisation Research Centre (LRC). All rights reserved.