Best Thesis - 2013

Lossless Exchange of Data: Limitations of Localisation Data Exchange Standards and their Effect on Interoperability

-Ruwan Asanka Wasala


Localisation is a relatively new field, one that is constantly changing with the latest technology to offer services that are cheaper, faster and of higher quality. As the complexity of localisation projects increases, interoperability between different localisation tools and technologies becomes ever more important. However, lack of interoperability is a significant problem in localisation. One important aspect of interoperability in localisation is the lossless exchange of data between different technologies and different stages of the overall process.

Standards play a key role in enabling interoperability. Several standards exist that address data-exchange interoperability. The XML Localisation Interchange File Format (XLIFF) has been developed by a Technical Committee of the Organization for the Advancement of Structured Information Standards (OASIS) and is an important standard for enabling interoperability in the localisation domain. It aims to enable the lossless exchange of localisation data and metadata between different technologies. With increased adoption, XLIFF is maturing. Solutions to significant issues relating to the current version of the XLIFF standard and its adoption are being proposed in the context of the development of XLIFF version 2. However, matters that are important for successful adoption of the standard, such as standard compliance, conformance and the interoperability of technologies claiming to support the standard, have not yet been adequately addressed. In this research, we aim to fill this gap by identifying the limitations of the XLIFF standard and of its implementations that lead to interoperability issues.

First, we conducted a pilot study to gain an in-depth understanding of the features of localisation data-exchange standards. The pilot study involved a systematic comparison of XLIFF and Localisation Content Exchange, Microsoft’s internal localisation data-exchange format. Having gained a better understanding of these features, in our main experiment we focused on identifying the limitations of XLIFF and its implementations. For this purpose, we constructed the first large corpus of XLIFF files, which enabled us to perform various statistical analyses of the real usage of features of the XLIFF specification. From this research, we identified not only the limitations that lead to interoperability issues, but also the features of the standard that are important for successful syntactic and semantic interoperability. In parallel to our main experiment, we designed and created a prototype based on a service-oriented architecture to investigate interoperability issues among distributed localisation tools and technologies used in the localisation process.

As the main contribution of our thesis, we propose a framework called XML - Data Interoperability Usage Evaluation (XML-DIUE), a systematic approach, based on our main experiment, to identifying the limitations of data-exchange standards and their implementations. The framework accomplishes this by helping standard developers to empirically identify the most important elements, the less important elements, the problematic areas, usage patterns, associated tools, and so on. The proposed framework provides a practical approach for improving the interoperability of data-exchange formats.
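
To illustrate the kind of usage statistics such an analysis could rest on, the following is a minimal sketch in Python that counts how often each element occurs across a set of XLIFF files and how many files use it. It is not the XML-DIUE implementation: the function names `local_name` and `element_usage` and the two tiny XLIFF 1.2 samples are hypothetical and stand in for a real corpus.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def element_usage(documents):
    """Count element usage over a list of XML strings (hypothetical helper).

    Returns two Counters: total occurrences of each element name, and the
    number of documents in which each element name appears at least once.
    """
    occurrences = Counter()
    files_using = Counter()
    for text in documents:
        root = ET.fromstring(text)
        names = [local_name(el.tag) for el in root.iter()]
        occurrences.update(names)
        files_using.update(set(names))
    return occurrences, files_using


# Two invented XLIFF 1.2 documents standing in for a corpus (assumption).
SAMPLE_A = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="a.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Hello</source><target>Bonjour</target></trans-unit>
      <trans-unit id="2"><source>Bye</source><note>needs review</note></trans-unit>
    </body>
  </file>
</xliff>"""

SAMPLE_B = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="b.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Yes</source><target>Ja</target></trans-unit>
    </body>
  </file>
</xliff>"""

occurrences, files_using = element_usage([SAMPLE_A, SAMPLE_B])

# Rank elements by how many files use them, then by total occurrences.
for name in sorted(occurrences, key=lambda n: (-files_using[n], -occurrences[n], n)):
    print(f"{name}: {occurrences[name]} occurrences in {files_using[name]} file(s)")
```

Elements that appear in nearly every file would be candidates for the "most important" group, while elements that are rare or absent across the corpus would be candidates for closer inspection. The same counting could be extended to attributes and attribute values under the same assumptions.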