Introduction to Translation Standards

From Ott09 Wiki

Main discussion topics:

  • Main translation standards introduced: TMX, XLIFF, TBX

Ideas for further discussion this week

  • Standards gaps: quality control, translation management workflow
  • Need for open source input into standards like XLIFF

Discussion Notes

  • Building tools that are standards-based to make them more usable
  • Why do we need standards in tools?
  • How can people reuse translation memory (TM) in their work? A TM is a database of aligned translation pairs. There are standards for the size of the units that are matched 1:1 (phrases, sentences, etc.). TMs are good for a variety of purposes, such as statistical machine translation.
  • TMX is an XML-based file format and the standard format for translation memories (it stores the source text, the target text, and the languages they are in)
  • Standards for time-aligned translation?
  • XLIFF – take a segmented document (say Word or PDF), open it in an XLIFF-supporting tool and translate it, save it, and then convert it back to the original document format. Supports human (rather than machine) translation tools
  • TBX – online simple dictionary format, stores terminology
  • TBX Basic - simpler form of TBX
  • TMI – Text Markup Initiative (possibly meaning TEI, the Text Encoding Initiative) – standards for lexicography and syntactic markup, probably
  • Problems: translations locked into translation tools
  • SRX – standard for specifying segmentation. Works with XLIFF to help you segment the source document. See OmegaT tool.
  • Standards for syntactic and semantic markup?
  • Need more open source input into XLIFF
  • ISO 639-3, Unicode, UTF-8 (briefly mentioned)
  • GMX-V (noted as "GMV") – a standard for counting words that tries to quantify the complexity of words
  • No standards that we know of around quality control
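
To make the TMX note above more concrete, here is a minimal sketch of what reusing a translation memory could look like. The sample document, the `load_pairs` helper and the sentences are made up for illustration; only the TMX element names (`tu`, `tuv`, `seg`) and the `xml:lang` attribute come from the format itself.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical two-entry translation memory, for illustration only.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="0.1" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrez le fichier.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Close the window.</seg></tuv>
      <tuv xml:lang="fr"><seg>Fermez la fenêtre.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def load_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) text pairs for one language pair."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        # Map each language code in this translation unit to its segment text.
        by_lang = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                by_lang[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src_lang in by_lang and tgt_lang in by_lang:
            pairs.append((by_lang[src_lang], by_lang[tgt_lang]))
    return pairs


pairs = load_pairs(SAMPLE_TMX, "en", "fr")
memory = dict(pairs)  # exact-match lookup; real tools also do fuzzy matching
```

A list of aligned pairs like this is also the kind of input a statistical machine translation system would train on.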
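
The XLIFF workflow described above (extract, translate, convert back) centres on a file of translation units. The following is a rough sketch of the "translate" step only, assuming XLIFF 1.2 and a made-up dictionary of finished translations; the `fill_targets` helper and the sample file are hypothetical, and a real tool would also handle inline markup and state attributes.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", NS)  # keep the default namespace on output

# Hypothetical XLIFF 1.2 file extracted from a plain-text document.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="readme.txt" source-language="en" target-language="es" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Open the file.</source></trans-unit>
      <trans-unit id="2"><source>Close the file.</source></trans-unit>
    </body>
  </file>
</xliff>"""


def fill_targets(xliff_text, translations):
    """Add a <target> to every trans-unit whose source text has a translation."""
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        source = unit.find(f"{{{NS}}}source")
        if source is None:
            continue
        text = "".join(source.itertext())
        if text in translations:
            target = unit.find(f"{{{NS}}}target")
            if target is None:
                target = ET.SubElement(unit, f"{{{NS}}}target")
            target.text = translations[text]
    return ET.tostring(root, encoding="unicode")


translated = fill_targets(SAMPLE_XLIFF, {"Open the file.": "Abra el archivo."})
```

Units without a known translation are left with only their `<source>`, which is how a translator would see them as still to do.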
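
For the SRX note above, the sketch below imitates the general idea of rule-based segmentation: each rule has a "before break" and an "after break" pattern and says whether to break or not, and the first rule that matches at a position decides. The rules here are simplified Python tuples invented for illustration, not an actual SRX file, and the `segment` function is a hypothetical helper.

```python
import re

# (is_break, before-break regex, after-break regex); illustrative rules only.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|etc)\.", r"\s"),  # exception: no break after abbreviations
    (True, r"[.?!]", r"\s"),                   # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text into segments; the first matching rule at each position wins."""
    compiled = [
        (is_break, re.compile(f"(?:{before})\\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # The rule applies if the text up to pos ends with the "before"
            # pattern and the text from pos starts with the "after" pattern.
            if before.search(text[:pos]) and after.match(text[pos:]):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule decides, break or not
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


result = segment("Dr. Smith opened the file. Then he saved it.")
```

Putting the exception rule first is what keeps "Dr." from ending a segment; swapping the order of the two rules would split after the abbreviation.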