TRST 20009 - The Multilingual Information Lifecycle

Student webpages

Alignment page

Tag editor page

Weeks One and Two
  1. Translation is "an interlinguistic transfer procedure comprising the interpretation of the sense of a source text and the production of a target text with the intent of establishing a relationship of equivalence between the two texts, while at the same time observing both the inherent communication parameters and the constraints imposed on the translator." (From: Terminologie de la traduction - Translation Terminology, p. 188)
  2. Globalization (G11N), internationalization (I18N), and localization (L10N) are defined and explained at the Localisation Industry Standards Association (LISA) website under "About Globalisation".
  3. Locale is "a set of parameters that defines the user's language, country and any special variant preferences that the user wants to see in their user interface. Usually a locale identifier consists of at least a language identifier and a region identifier." (Wikipedia: Locale). See also ISO 3166-1 (alpha-2) for two-letter country codes, ISO 639-1 for two-letter language codes, and ISO 639-2 for three-letter language codes.
  4. Character encoding "consists of a code that pairs each character from a given repertoire with something else, such as a sequence of natural numbers, octets or electrical pulses, in order to facilitate the transmission of data (generally numbers and/or text) through telecommunication networks or storage of text in computers. Other terms like character encoding, character set (charset), and sometimes character map or code page are used almost interchangeably." (Wikipedia: Character encoding)
  5. "In typography, a font (also fount) is traditionally defined as a quantity of sorts composing a complete character set of a single size and style of a particular typeface. ... After the introduction of computer fonts based on fully scalable outlines, a broader definition evolved. Font is no longer size-specific, but still refers to a single style. Bulmer regular, Bulmer italic, Bulmer bold and Bulmer bold italic are four fonts, but one typeface. However, the term font is also often used as a synonym for typeface." (Wikipedia: Font).
  6. "A keyboard layout is any specific mechanical, visual, or functional arrangement of the keys, legends, or key–meaning associations (respectively) of a computer, typewriter, or other typographic keyboard." (Wikipedia: Keyboard layout).
  7. Unicode "provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language." (The Unicode Consortium: What is Unicode?)
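
The definitions of locale, character encoding, and Unicode above can be illustrated with a short Python sketch. It is only an illustration, not part of the assigned readings: the function names `split_locale` and `describe_character` are hypothetical, and the locale parsing assumes the simple "language plus region" form (such as fr-CA or fr_CA) rather than the full range of identifiers found in practice.

```python
import unicodedata


def split_locale(identifier):
    """Split a simple locale identifier into (language, region).

    Assumes the form 'fr-CA' or 'fr_CA'; a bare language such as 'en'
    yields a region of None. Real locale identifiers can carry more parts.
    """
    parts = identifier.replace("-", "_").split("_")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return language, region


def describe_character(char):
    """Show the Unicode number of a character and how two encodings store it."""
    return {
        "code_point": f"U+{ord(char):04X}",       # the unique Unicode number
        "name": unicodedata.name(char),            # the official character name
        "utf8": char.encode("utf-8").hex(),        # bytes in UTF-8
        "latin1": char.encode("latin-1").hex(),    # bytes in ISO 8859-1 (Latin-1)
    }


print(split_locale("fr-CA"))
print(describe_character("é"))

# Reading UTF-8 bytes as if they were Latin-1 produces garbled text,
# which is why the encoding of a file must be known before it is opened.
garbled = "é".encode("utf-8").decode("latin-1")
print(garbled)
```

The same character has one Unicode number (U+00E9) but is stored as different bytes in different encodings. When a file is opened with the wrong encoding, accented characters appear as pairs of unrelated symbols, a problem that often shows up when text moves between translation tools.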

Weeks Three and Four

Review assigned readings!
  1. The information lifecycle - see Content Management System, Web Content Management System, Component Content Management System, and Document Management System.
  2. Authoring in translation: "Authoring" is a term for the process of writing a document. It seems to have come into use to emphasize that document production involves more than just writing.
  3. Information stakeholders: Any individuals or groups who have an interest in, and depend on, a set of data or information. Stakeholders may include information producers, knowledge workers, external customers, and regulatory bodies, as well as various information systems roles such as database designers, application developers, and maintenance personnel.
  4. Retrofitting "multilingual": The situation in which a product or service is designed and brought to market exclusively in one language, with no thought given to the need to internationalize the product, its packaging, its design, or its accompanying materials. This includes software, help files, and manuals. The problems with this approach are that the product itself may be inappropriate for the foreign market, and that written or electronic materials may need to be redesigned to accommodate texts of different lengths or different cultural expectations.
  5. The language industry designs, produces, and markets tools, products, or services related to computerized language processing. It includes translation; editing, reviewing, and proofreading; CAT tools; terminology extraction; and localization. See Language industry.
  6. Text inputs - Getting text from paper to electronic form is necessary in order to effectively store and re-use text, particularly in the context of CAT tools. Text can be input by retyping, by scanning and optical character recognition, or by dictation into a voice recognition system. All of these approaches require proofreading, but particularly the latter two.
  7. OCR - Optical Character Recognition: The conversion of images of printed or typed characters, such as a scanned page, into electronic text that can be manipulated in other applications such as word processors.
  8. Conversions - Files input from paper may be converted into a variety of file formats. Saving scanned OCR text as a .TXT file may be the only way to obtain clean text, but preserving the formatting requires saving it in a word-processing format such as .DOC, .DOCX, or .ODT. Files in more complex formats, such as PowerPoint (.PPT), Excel (.XLS), or similar presentation or spreadsheet programs, may need to be reproduced manually.
  9. Multilingual resources: 

Web Resources