

Localisation Glossary

Terms

Definition

ANSI The American National Standards Institute. Also used by Microsoft as a collective name for all Windows code pages, and sometimes specifically for code page 1252, which is a superset of ISO/IEC 8859-1.
ASCII Acronym for American Standard Code for Information Interchange. It is a 7-bit character set encoding that contains characters for unaccented letters a-z and A-Z, most English punctuation marks, numbers, and a few control characters.
BIDI Abbreviation for bi-directionality - the process of mixing left-to-right and right-to-left oriented text in a single line.
Case Feature of certain alphabets where the letters have two distinct forms. These variants, which may differ markedly in shape and size, are called uppercase and lowercase letters.
Character The smallest component of written language that has semantic value; refers to the abstract meaning and/or shape, rather than a specific shape, though in code tables some form of visual representation is essential for the reader's understanding.
Character Encoding Form The mapping from a character set definition to the actual code units used to represent the data.
Character Set A collection of elements used to represent textual information.
CJKV The Chinese, Japanese, Korean and Vietnamese languages - sometimes treated as a group.
Coded Character Set A character set in which each character is assigned a numeric code point. Frequently abbreviated as character set, or code set.
Code Page A coded character set, often referring to a coded character set used by a personal computer - e.g. PC code page 437, the default coded character set used by the U.S. English version of the DOS operating system.
DBCS Double-Byte Character Set. A character set in which characters are encoded using one or two bytes. It is used for languages such as Chinese, Japanese and Korean, whose scripts contain far more characters than a single byte can represent.
Diacritic A mark applied or attached to a symbol to create a new symbol that represents a modified or new value e.g. adding an accent to a character. Also called a diacritical mark or a diacritical.
DTP DeskTop Publishing. The construction of the layout and the formatting of any printed documents prior to the actual printing of such documents.
FIGS French, Italian, German and Spanish - sometimes treated as a group.
Font A collection of glyphs used for the visual depiction of character data. A font is often associated with a set of parameters (for example size, posture, weight), which, when set to particular values, generate a collection of glyphs that can be rendered.
Fuzzy Matching A method used in translation memory to match the sentence currently being translated with a similar, previously translated sentence. The higher the percentage of the 'fuzzy match', the more similar the sentences are.
Glyph The graphical representation/shape of a character. A character can have more than one glyph, e.g. the letter G set in two different typefaces is the same character shown with different glyphs.
HTML HyperText Markup Language. A text description language related to SGML; it mixes text format markup with plain text content to describe formatted text. HTML is ubiquitous as the source language for Web pages.
I18N Internationalisation - this is the process of implementing a product so that it is not specific to any one locale i.e. so that it can handle any language for any locale without modifying any part of the design of the product. This process must take place before a product can effectively be localised.
L10N Localisation - this is the process of taking a product and making it linguistically and culturally appropriate for a specific locale.
Leverage The process of taking elements from previous product translations and automatically incorporating them into new revisions.
Ligature A glyph representing a combination of two or more characters. In Latin script, there are only a few in modern use, such as the ligatures between "f" and "i" or between "f" and "l".
Locale A set of attributes specific to a language and geographical region, e.g. date format, currency format etc. An example of a locale is: English (U.S.).
Localisation The linguistic and cultural adaptation of digital content to the requirements of a foreign market, and the provision of services and technologies for the management of multilingualism across the digital global information flow.
Multibyte Character Set A character set encoded with a variable number of bytes per character. Many large character sets have been defined as MBCS so as to keep strict compatibility with the ASCII subset and/or ISO/IEC 2022. Abbreviated as MBCS.
OLIF Open Lexicon Interchange Format. It is a vehicle for exchanging terminological and lexical data.
SBCS Acronym for Single-Byte Character Set. Any 1-byte character encoding. This term is generally used in contrast with DBCS and/or MBCS.
Script A collection of symbols used to represent textual information in one or more writing systems.
SGML Standard Generalized Markup Language. A metalanguage: a standard framework for defining particular text markup languages. The SGML framework allows for mixing structural tags that describe format with the plain text content of documents, so that richly formatted text can be fully described in a plain text stream of data.
Sim Ship Simultaneous Shipment. Release of all localised versions and the original product at the same time.
Transcoding Conversion of character data between different character sets.
Writing System A set of rules for using one or more scripts to write a particular language. Examples include the American English writing system, the French writing system and the Japanese writing system.
Unicode A worldwide character-encoding standard that aims to encode the characters of all known writing systems. Originally designed as a 16-bit character set, it now defines code points from U+0000 to U+10FFFF, which are stored using the UTF-8, UTF-16 or UTF-32 encoding forms.
XLIFF XML Localisation Interchange File Format. It is an XML specification for multilingual data exchange. The standard is maintained by OASIS.
XML eXtensible Markup Language. It is a metalanguage - a language for defining other markup languages - and a format used for structuring documents and data on the web. XML is a simplified subset of SGML.
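
The Character Encoding Form entry describes a mapping from characters to code units. As a rough illustration only, the Python sketch below lists the code units that one character produces under three Unicode encoding forms. The helper name code_units is an assumption made for this example, not part of any standard.

```python
import struct

def code_units(text: str, form: str) -> list[int]:
    """Return the code units of text under 'utf-8', 'utf-16' or 'utf-32'."""
    # Code unit size in bytes and the matching struct format character
    widths = {"utf-8": (1, "B"), "utf-16": (2, "H"), "utf-32": (4, "I")}
    size, fmt = widths[form]
    # Use the big-endian codec for 16/32-bit forms so no byte order mark is added
    codec = form if form == "utf-8" else form + "-be"
    data = text.encode(codec)
    return list(struct.unpack(f">{len(data) // size}{fmt}", data))

euro = "\u20ac"  # EURO SIGN, code point U+20AC
print([hex(u) for u in code_units(euro, "utf-8")])   # ['0xe2', '0x82', '0xac']
print([hex(u) for u in code_units(euro, "utf-16")])  # ['0x20ac']
print([hex(u) for u in code_units(euro, "utf-32")])  # ['0x20ac']
```

The same code point therefore occupies three 8-bit code units in UTF-8 but a single code unit in UTF-16 and UTF-32.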
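
The Fuzzy Matching entry refers to a percentage that expresses how similar two sentences are. The sketch below shows one possible way to compute such a score, using Python's standard difflib module. It is an assumption for illustration: commercial translation memory tools use their own scoring methods, and the function names, the sample memory and the 70% threshold are all hypothetical.

```python
from difflib import SequenceMatcher

def fuzzy_match_percent(new_sentence: str, stored_sentence: str) -> int:
    """Similarity of two sentences as a whole percentage (100 = identical)."""
    ratio = SequenceMatcher(None, new_sentence.lower(), stored_sentence.lower()).ratio()
    return round(ratio * 100)

def best_match(new_sentence: str, memory: dict, threshold: int = 70):
    """Return (source, translation, score) for the closest stored sentence,
    or None if no stored sentence reaches the threshold."""
    best = None
    for source, translation in memory.items():
        score = fuzzy_match_percent(new_sentence, source)
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best

# Hypothetical translation memory: English source -> French translation
memory = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "Close the window.": "Fermez la fenêtre.",
}
print(best_match("Click the Save icon.", memory))
```

A translator would then review the suggested translation and adapt the part that differs.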
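
The Multibyte Character Set entry mentions a variable number of bytes per character. As a small illustration, the sketch below counts the bytes each character needs in Shift_JIS, a multibyte encoding for Japanese that keeps ASCII letters at one byte. The function name bytes_per_character is an assumption made for this example.

```python
def bytes_per_character(text: str, codec: str = "shift_jis") -> list:
    """Return (character, byte count) pairs for text in the given encoding."""
    return [(ch, len(ch.encode(codec))) for ch in text]

# Latin letter A followed by HIRAGANA LETTER A (U+3042)
print(bytes_per_character("A\u3042"))           # [('A', 1), ('あ', 2)]
print(bytes_per_character("A\u3042", "utf-8"))  # [('A', 1), ('あ', 3)]
```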
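
The Transcoding entry can be illustrated in a few lines: decode the bytes using the source character set, then encode the result in the target one. This is a minimal sketch using Python's built-in codecs; the function name transcode is an assumption made for this example.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert encoded text from the source character set to the target one."""
    # Raises UnicodeEncodeError if a character has no equivalent in the target
    return data.decode(source).encode(target)

# "café" stored in Windows code page 1252, converted to UTF-8
print(transcode(b"caf\xe9", "cp1252", "utf-8"))  # b'caf\xc3\xa9'

# The euro sign exists in code page 1252 but not in ISO/IEC 8859-1
try:
    transcode(b"\x80", "cp1252", "iso-8859-1")
except UnicodeEncodeError:
    print("no equivalent character in the target character set")
```

Transcoding is only lossless when every character in the data also exists in the target character set.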

 

Other Glossaries Available Online

EURODICAUTOM - The multilingual terminological database of the European Commission's Translation Service. Monitored by Directorate General Information Society, the database contains technical terms, abbreviations, acronyms and phraseology. On April 1st 1999, it contained more than 5.5 million entries.

LOGOS - Multilingual E-Translation Portal

Microsoft Localisation Glossary - Short glossary of localisation terms.

Microsoft Glossaries - Microsoft glossaries in many different languages used in their applications.

Oracle Localisation Glossary - Short glossary of localisation terms (you need to register first and get a password).

Unicode Glossary

The Translator's Home Companion - Provides a set of dictionaries and glossaries sorted by subject (see Computing).

Terminology Collection - Online dictionaries and related links.

Acronym Finder - A searchable database containing common acronyms and abbreviations about all subjects, with a focus on computers, technology, telecommunications, and the military.

 

© Copyright 2007 Localisation Research Centre (LRC). All rights reserved.