TRST 20009 - The Multilingual Information Lifecycle
Weeks One and Two
- Translation is "an
interlinguistic transfer procedure comprising the interpretation of the
sense of a source text and the production of a target text with the
intent of establishing a relationship of equivalence between the two
texts, while at the same time observing both the inherent communication
parameters and the constraints imposed on the translator." (From: Terminologie de la traduction -
Translation Terminology, p. 188)
- Globalization (G11N), internationalization (I18N), and localization (L10N) are defined and
explained at the Localisation Industry Standards Association (LISA)
website under "About Globalisation".
- Locale is "a set of
parameters that defines the user's language, country and any special
variant preferences that the user wants to see in their user interface.
Usually a locale identifier consists of at least a language identifier
and a region identifier." (Wikipedia: Locale).
See also ISO 3166-1 (alpha-2) for two-letter country codes, ISO 639-1 for two-letter language codes, and ISO 639-2 for three-letter language codes.
- Character
encoding "consists of a code that pairs each character from
a given repertoire with something else, such as a sequence of natural
numbers, octets or electrical pulses, in order to facilitate the
transmission of data (generally numbers and/or text) through
telecommunication networks or storage of text in computers. Other terms
like character encoding, character set (charset),
and sometimes character map or code
page are used almost interchangeably." (Wikipedia: Character
encoding)
- "In typography, a font (also fount) is traditionally
defined as a quantity of sorts composing a complete character set of a
single size and style of a particular typeface. ... After the
introduction of computer fonts based on fully scalable outlines, a
broader definition evolved. Font is no longer size-specific, but still
refers to a single style. Bulmer
regular, Bulmer italic, Bulmer bold and Bulmer bold italic are four fonts, but one typeface. However, the term font is
also often used as a synonym for typeface." (Wikipedia: Font).
- "A keyboard layout is any specific mechanical, visual,
or functional arrangement of the keys, legends, or
key–meaning associations (respectively) of a computer, typewriter, or other typographic keyboard." (Wikipedia: Keyboard
layout).
- Unicode "provides a
unique number for every character, no matter what the platform, no
matter what the program, no matter what the language." (The Unicode
Consortium: What is Unicode?)
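The definitions of locale, character encoding, and Unicode above can be illustrated with a minimal Python sketch. The function names (`split_locale`, `code_point`, `encoded_bytes`) are hypothetical helpers invented for this illustration. The locale splitter assumes a simple language-plus-region identifier such as fr-CA or fr_CA; it does not handle script subtags or other variants.

```python
from typing import Optional, Tuple


def split_locale(identifier: str) -> Tuple[str, Optional[str]]:
    """Split a simple locale identifier into (language, region).

    Assumes the form 'll', 'll-RR' or 'll_RR'; anything more complex
    (scripts, variants) is outside the scope of this sketch.
    """
    parts = identifier.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return language, region


def code_point(char: str) -> str:
    """Return the Unicode number of a single character in U+XXXX notation."""
    return f"U+{ord(char):04X}"


def encoded_bytes(text: str, encoding: str) -> str:
    """Return the bytes that a given character encoding assigns to the text, in hex."""
    return text.encode(encoding).hex(" ")


if __name__ == "__main__":
    print(split_locale("fr_CA"))
    print(code_point("é"))
    # The same character has one Unicode number but different byte
    # sequences depending on the character encoding used to store it.
    for enc in ("utf-8", "latin-1", "utf-16-be"):
        print(enc, encoded_bytes("é", enc))
```

The point of the last loop is the distinction drawn in the definitions: Unicode assigns the character its number, while the character encoding determines which bytes are actually stored or transmitted.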
Weeks Three and Four
Review assigned readings!
- The information lifecycle - see Content Management System,
Web Content Management System, Component Content Management System,
and Document Management System
- Authoring in
translation: A term for the process of writing a document.
"Authoring" seems to have
come into use in order to emphasize that document production involved
more than just writing.
- Information stakeholders: Any
individual who has an interest in and dependence on a set of data
or information. Stakeholders may include information producers,
knowledge workers, external customers, and regulatory bodies, as well
as various information systems roles such as database designers,
application developers, and maintenance personnel.
- Retrofitting
"multilingual": The idea that a product or service is
designed and brought to market exclusively in one language without
thought of the necessity to internationalize the product, its
packaging, or its design and accompanying materials. This includes
software, help files, and manuals. Problems with this approach are that
the product itself may be inappropriate for the foreign market, or that
written or electronic materials may need to be redesigned to
accommodate texts of different lengths or different cultural expectations.
- The language industry designs,
produces, and markets tools, products, or services related to
computerized language processing. It includes translation; editing, reviewing, and
proofreading; CAT tools; terminology extraction; and localization. See Language industry.
- Text inputs - Getting
text from paper to electronic form is necessary in order to effectively
store and re-use text, particularly in the context of CAT tools. Text
can be input by retyping, by scanning and optical character
recognition, or by dictation into a voice recognition system. All of
these approaches require proofreading, but particularly the latter two.
- OCR
- Optical Character Recognition: The conversion of images of printed
characters (for example, a scanned page) into electronic text that can
be manipulated in other applications such as word processors.
- Conversions
- Files input from paper may be converted into a variety
of file formats. It may only be possible to obtain clean text by saving
scanned OCR text into a .TXT file, but in order to achieve correct
formatting, it is necessary to save it in a word-processing format such
as .DOC, .DOCX, or .ODT. More complex file formats such as PowerPoint
(.PPT) or Excel (.XLS), or files from similar presentation or spreadsheet
programs, may need to be reproduced manually.
- Multilingual
resources:
- Dictionaries can
be 1. print resources (the larger, the better), 2. CD-ROM
or downloadable, installable dictionaries, or 3. subscription services
on the Internet. The advantage of the electronic resources is more
effective and faster searching, including hits inside entries that
cannot be found with a headword search.
- Glossaries: A
glossary is an alphabetical list of terms in a particular domain of
knowledge with the definitions for those terms. A bilingual glossary is often an
alphabetical list of terms in one language with their equivalents in
another language, but without definitions.
- Terminology:
The set of all the terms that are specific to a
special subject field. Multilingual
terminology is often maintained using a terminology management
tool (such as MultiTerm), and, when properly done, includes terms,
definitions, and contexts in multiple languages.
- Parallel
text: 1. A text that represents the same text type as
the source text. These types of PTs provide information concerning
target audience expectations. 2. A text that treats the same or a
closely related topic in the same subject field and that serves as a
source for the mots justes
and terms that should ideally be incorporated into the target text to
ensure collocation cohesion.
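The difference between a bilingual glossary and a terminology entry, as described above, can be sketched as two data structures. This is only an illustration under stated assumptions: the English-French term pairs are example data, not taken from the course materials, and `TermEntry` is a hypothetical class, not the data model of MultiTerm or any other terminology management tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A bilingual glossary: terms in one language mapped to equivalents in
# another, with no definitions. The pairs are illustrative examples only.
glossary: Dict[str, str] = {
    "font": "police",
    "character encoding": "codage des caractères",
    "keyboard layout": "disposition de clavier",
}


def alphabetical(entries: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the glossary as an alphabetical list of (term, equivalent) pairs."""
    return sorted(entries.items(), key=lambda pair: pair[0].casefold())


@dataclass
class TermEntry:
    """A hypothetical terminology entry: one concept, several languages.

    Unlike a glossary line, it carries definitions and contexts as well
    as the terms themselves, each keyed by language code.
    """
    subject_field: str
    terms: Dict[str, str]
    definitions: Dict[str, str] = field(default_factory=dict)
    contexts: Dict[str, str] = field(default_factory=dict)


if __name__ == "__main__":
    for term, equivalent in alphabetical(glossary):
        print(f"{term} = {equivalent}")

    entry = TermEntry(
        subject_field="typography",
        terms={"en": "font", "fr": "police"},
        definitions={"en": "A single style of a particular typeface."},
    )
    print(entry.terms["fr"])
```

A glossary line answers only "what is the equivalent?", while the entry structure leaves room for the definitions and contexts that properly maintained multilingual terminology is expected to include.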
Web Resources