| Term |
Definition |
| ANSI |
The
American National Standards Institute. Microsoft also uses the term as a
collective name for all Windows code pages, and sometimes specifically for
code page 1252, which is a superset of ISO/IEC 8859-1. |
| ASCII |
Acronym
for American
Standard Code for Information Interchange. It is a 7-bit character
encoding (128 code points) that contains the unaccented letters a-z and
A-Z, the digits 0-9, most English punctuation marks, and 33 control
characters. |
| BIDI |
Abbreviation
for bi-directionality - the process of mixing left-to-right and
right-to-left oriented text in a single line. |
| Case |
Feature
of certain alphabets where the letters have two distinct forms.
These variants, which may differ markedly in shape and size, are
called uppercase and lowercase letters. |
| Character |
The
smallest component of written language that has semantic value;
refers to the abstract meaning and/or shape, rather than a
specific shape, though in code tables some form of visual
representation is essential for the reader's understanding. |
| Character
Encoding Form |
The
mapping from a character set definition to the actual code units
used to represent the data. |
| Character
Set |
A
collection of elements used to represent textual information. |
| CJKV |
The
Chinese,
Japanese, Korean and Vietnamese languages - sometimes treated as a group. |
| Coded
Character Set |
A
character set in which each character is assigned a numeric code
point. Frequently abbreviated as character set, or code set. |
| Code
Page |
A
coded character set, often referring to a coded character set used
by a personal computer - e.g. PC code page 437, the default coded
character set used by the U.S. English version of the DOS
operating system. |
| DBCS |
Double-Byte
Character Set. A character set in which each character is encoded in
one or two bytes. DBCSs are used for East Asian languages such as
Chinese, Japanese and Korean, whose thousands of characters cannot fit
in the 256 values available in a single byte. |
| Diacritic |
A
mark applied or attached to a symbol to create a new symbol that
represents a modified or new value, e.g. adding an accent to a
character. Also called a diacritical mark or a diacritical. |
| DTP |
DeskTop
Publishing. The construction of the layout and the formatting of
any printed documents prior to the actual printing of such
documents. |
| FIGS |
French,
Italian, German and Spanish - sometimes treated as a group. |
| Font |
A
collection of glyphs used for the visual depiction of character
data. A font is often associated with a set of parameters (for
example size, posture, weight), which, when set to particular
values, generate a collection of renderable glyphs. |
| Fuzzy
Matching |
A
method used in translation memory to match the sentence currently
being translated against similar, previously translated sentences.
The higher the percentage of the 'fuzzy match', the more
similar the sentences are. |
| Glyph |
The
graphical representation/shape of a character. A character can
have more than one glyph, e.g. the letter G set in two different
typefaces is the same character drawn with
different glyphs. |
| HTML |
HyperText
Markup Language. A text markup language related to SGML; it
mixes text format markup with plain text content to describe
formatted text. HTML is ubiquitous as the source language for
Web pages on the Internet. |
| I18N |
Internationalisation
(the 18 stands for the letters omitted between I and N). The process of
implementing a product so that it is not specific to any one locale,
i.e. so that it can handle any language for any locale without
modifying any part of the design of the product. This process must
take place before a product can effectively be localised. |
| L10N |
Localisation
(the 10 stands for the letters omitted between L and N). The process of
taking a product and making it
linguistically and culturally appropriate for a specific locale. |
| Leverage |
The
process of reusing elements from previous product
translations by automatically incorporating them into new
revisions. |
| Ligature |
A
glyph representing a combination of two or more characters. In
Latin script, there are only a few in modern use, such as the ligatures between "f" and "i" or between
"f" and "l". |
| Locale |
A
set of attributes specific to a language and geographical region,
e.g. date format, currency format etc. An example of a locale is:
English (U.S.). |
| Localisation |
The linguistic and cultural adaptation of digital content to the requirements of a foreign market, and the provision of services and technologies for the management of multilingualism across the digital global information flow. |
| Multibyte
Character Set |
A
character set encoded with a variable number of bytes per
character. Many large character sets have been defined as MBCS so
as to keep strict compatibility with the ASCII subset and/or
ISO/IEC 2022. Abbreviated as MBCS. |
| OLIF |
Open
Lexicon Interchange Format. It is a vehicle for exchanging
terminological and lexical data. |
| SBCS |
Acronym
for Single-Byte Character Set. Any 1-byte character encoding. This
term is generally used in contrast with DBCS and/or MBCS. |
| Script |
A
collection of symbols used to represent textual information in one
or more writing systems. |
| SGML |
Standard
Generalized Markup Language. A metalanguage: a standard framework for
defining particular text markup languages. The SGML framework allows for
mixing structural tags that describe format with the plain text
content of documents, so that formatted text can be fully described in
a plain text stream of data. |
| Sim
Ship |
Simultaneous
Shipment. Release of all localised versions and the original
product at the same time. |
| Transcoding |
Conversion
of character data between different character sets. |
| Writing
System |
A
set of rules for using one or more scripts to write a particular
language. Examples include the American English writing system,
the French writing system and the Japanese writing system. |
| Unicode |
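A
character set designed to encode the characters of all the world's
writing systems and used as a worldwide character-encoding standard.
Its code points range from U+0000 to U+10FFFF, so it is not limited to
16 bits; text is stored using the UTF-8, UTF-16 or UTF-32 encoding
forms. |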
| XLIFF |
XML
Localisation Interchange File Format. It is an XML-based specification
for multilingual data exchange. This standard is under the
supervision of OASIS. |
| XML |
eXtensible
Markup Language. It is a metalanguage, i.e. a language
for defining other markup languages. It is a format used for
structuring documents and data on the web. XML is a simpler subset
of SGML. |
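
Several of the encoding terms defined in the table above (ASCII, Code Page, SBCS, DBCS, Multibyte Character Set, Transcoding, Diacritic) can be illustrated with a short Python sketch. It is only an illustration built on Python's standard codecs; the function names `encoded_sizes`, `transcode` and `strip_diacritics`, and the choice of encodings, are assumptions made for this example and do not come from any of the glossaries listed here.

```python
import unicodedata
from typing import Dict, Optional

# Encodings chosen to mirror glossary terms (the selection is illustrative):
# ASCII (7-bit), code pages 437 and 1252 (SBCS), Shift_JIS (a DBCS)
# and UTF-8 (a multibyte encoding of Unicode).
ENCODINGS = ("ascii", "cp437", "cp1252", "shift_jis", "utf-8")


def encoded_sizes(text: str) -> Dict[str, Optional[int]]:
    """Return the byte length of `text` in each encoding, or None when
    the coded character set cannot represent some character."""
    sizes: Dict[str, Optional[int]] = {}
    for name in ENCODINGS:
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None
    return sizes


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Transcoding: convert character data from one character set to another."""
    return data.decode(source).encode(target)


def strip_diacritics(text: str) -> str:
    """Decompose each character, then drop the combining marks (diacritics)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    latin = "caf\u00e9"               # "cafe" with an acute accent on the e
    japanese = "\u65e5\u672c\u8a9e"   # the Japanese word for "Japanese language"

    # Code points assigned to each character by the coded character set.
    print([f"U+{ord(ch):04X}" for ch in latin])

    # Byte counts per encoding; None marks an unrepresentable text.
    print(encoded_sizes(latin))
    print(encoded_sizes(japanese))

    # 0x82 is e-acute in code page 437; in code page 1252 it is 0xE9.
    print(transcode(b"caf\x82", "cp437", "cp1252"))

    print(strip_diacritics(latin))
```

Running the script shows that the accented word needs one byte per character in the single-byte code pages but five bytes in UTF-8, while the three Japanese characters need two bytes each in Shift_JIS and cannot be represented in the Western code pages at all.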

EURODICAUTOM - The multilingual terminological database of the European Commission's Translation Service. Monitored by Directorate General Information Society, the database contains technical terms, abbreviations, acronyms and phraseology. On April 1st 1999, it contained more than 5.5 million entries.

LOGOS - Multilingual E-Translation Portal.

Microsoft Localisation Glossary - Short glossary of localisation terms.

Microsoft Glossaries - Microsoft glossaries in many different languages used in their applications.

Oracle Localisation Glossary - Short glossary of localisation terms (you need to register first and get a password).

Unicode Glossary

The Translator's Home Companion - Provides a set of dictionaries and glossaries sorted by subject (see Computing).

Terminology Collection - Online dictionaries and related links.

Acronym Finder - A searchable database containing common acronyms and abbreviations about all subjects, with a focus on computers, technology, telecommunications, and the military.