Automating Bilingual Terminology Creation

Automating Bilingual Terminology Creation - TAUS Annual Conference 2015 in San Jose

David Meikle

October 12, 2015

Transcript

  1. Democratizing Quality. Love your language.

    “Translation buyers shouldn’t have to buy yet more overpriced technology from vendors who fail to take ultimate responsibility for quality.” Christian Arno, CEO, Lingo24
  2. Terminology and Its Impact

    Terminology is an invaluable asset in effective translation and localization…

    Upstream effects on content creation
    ✓ Higher-quality documentation, as authors create less variation when working from validated terms
    ✓ Less time, as authors need less research into which term is appropriate

    Cost of translation
    ✓ Consistently created source texts are easier to translate
    ✓ More classic TM matches
    ✓ Shorter translation review time, as QA checks capture errors at source
    ✓ Dramatic increase in consistency = high-quality MT training material

    Overall quality
    ✓ Building a TermBank becomes easier once there is a qualified base to build from
    ✓ Linguist feedback is more relevant, as linguists help validate the terms in context
  3. Terminology and Its Impact

    We have found terminology is not always used to its potential given…

    High entry barriers
    ✓ Need for enterprise terminology strategy
    ✓ How to build a TermBank?
    ✓ Who approves terms?
    ✓ How to maintain a TermBank?

    Need to implement terminology tools
    ✓ Central storage/repository
    ✓ Application of terms to translations
    ✓ Quality assurance to ensure consistency
    ✓ Cost of tool
  4. How It Works? An abridged version…

    Lingo24 uses a refreshed statistical approach, combining techniques that look at the data differently from traditional frequency-based methods.

    Input: Sentence-aligned content (TMX, XLIFF, XLSX)
    Output: Bilingual TermBank (TMX, XLIFF, XLSX)

    Identify: Apply a log-likelihood comparison method to extract monolingual terminology from source and target.
    Rank: Use phrase-based statistical machine translation to align, rank and trim bilingual terminology.

    < Log-likelihood comparison + MT pipeline + custom features >
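The "Identify" step can be illustrated with a minimal sketch. It assumes the comparison is the standard two-corpus log-likelihood (G2) statistic, scoring each candidate by how strongly its frequency in the client's TM departs from its frequency in a generic reference corpus. The deck does not give Lingo24's actual formula or features, so the function names, the `min_count` cutoff and the toy corpora below are all hypothetical, and only single tokens are scored (multi-word terms would need n-gram candidates).

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Two-corpus log-likelihood (G2) for one candidate term.

    a, b: frequency of the term in the domain and reference corpus.
    c, d: total token counts of the domain and reference corpus.
    """
    expected_domain = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:  # 0 * log(0) is treated as 0
        score += a * math.log(a / expected_domain)
    if b > 0:
        score += b * math.log(b / expected_reference)
    return 2.0 * score


def rank_candidates(domain_tokens, reference_tokens, min_count=2):
    """Return (term, score) pairs for terms over-represented in the domain."""
    domain = Counter(domain_tokens)
    reference = Counter(reference_tokens)
    c, d = len(domain_tokens), len(reference_tokens)
    scored = []
    for term, a in domain.items():
        b = reference.get(term, 0)
        # Skip rare terms and terms that are not more frequent
        # (relatively) in the domain corpus than in the reference.
        if a < min_count or a / c <= b / d:
            continue
        scored.append((term, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data standing in for TM source segments and a generic corpus.
domain_tokens = (
    "the torque wrench tightens the flange bolt "
    "and the torque wrench clicks"
).split()
reference_tokens = (
    "the cat sat on the mat and the dog sat on the rug"
).split()

ranked = rank_candidates(domain_tokens, reference_tokens)
for term, score in ranked:
    print(f"{term}\t{score:.2f}")
```

Common words such as "the" drop out because they are no more frequent in the domain text than in the reference text, which is the main advantage of this comparison over a plain frequency count.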
  5. Powers Our Approach

    Automated Terms
    ✓ Log likelihood of TM vs generic terms
    ✓ Statistical machine translation alignment
    ✓ Custom feature ranking

    Right People
    ✓ Client head terminologist
    ✓ Client language terminologists
    ✓ Qualified Lingo24 linguists
    ✓ Casual terminologists / SMEs

    Tailored Tasks
    ✓ Evaluation of automated extraction results (source and target candidates)
    ✓ No/Yes interface for human review of unqualified term candidates
    ✓ Research and add quality meta-data
    ✓ Approval of results for inclusion in TermBank

    One Location
    ✓ Access to TM and concordances
    ✓ Access to comments and meta information
    ✓ Scalable authoring and approval structure
    ✓ All work performed based on relevant and validated data
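To make the alignment idea concrete, here is a deliberately simplified stand-in. The deck says alignment is done with phrase-based statistical machine translation. The sketch below does not implement that. Instead it pairs source and target candidates by how often they co-occur in the same aligned segment, scored with the Dice coefficient, and trims pairs below a threshold. The function name, the threshold value and the example segments are hypothetical.

```python
from collections import Counter


def dice_pairs(aligned, src_terms, tgt_terms, threshold=0.5):
    """Pair term candidates by co-occurrence in aligned segments.

    aligned: list of (source_sentence, target_sentence) tuples.
    src_terms, tgt_terms: lowercase monolingual term candidates.
    Returns (source_term, target_term, dice_score), best first.
    """
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in aligned:
        # Plain substring matching keeps the sketch short.
        src_found = {t for t in src_terms if t in src_sent.lower()}
        tgt_found = {t for t in tgt_terms if t in tgt_sent.lower()}
        src_count.update(src_found)
        tgt_count.update(tgt_found)
        joint.update((s, t) for s in src_found for t in tgt_found)

    pairs = []
    for (s, t), together in joint.items():
        score = 2 * together / (src_count[s] + tgt_count[t])
        if score >= threshold:  # trim weakly associated pairs
            pairs.append((s, t, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy English-German segments standing in for a sentence-aligned TM.
aligned = [
    ("Replace the ball bearing.", "Ersetzen Sie das Kugellager."),
    ("Grease the ball bearing.", "Schmieren Sie das Kugellager."),
    ("Check the flange bolt.", "Kontrollieren Sie die Flanschschraube."),
]
pairs = dice_pairs(
    aligned,
    src_terms=["ball bearing", "flange bolt"],
    tgt_terms=["kugellager", "flanschschraube"],
)
for src, tgt, score in pairs:
    print(f"{src} -> {tgt}\t{score:.2f}")
```

Pairs that survive the threshold are the kind of candidates that would then go to the No/Yes review step for human validation.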
  6. The Benefits

    ✓ Automatic creation of high-quality, domain-specific bilingual glossaries
    ✓ User-friendly reviewing and editing of automatically created glossaries
    ✓ Monolingual glossary creation from source documents
    ✓ Reduced cost and time of terminology creation compared to industry norms
    ✓ Increased consistency and accuracy across projects involving multiple linguists
    ✓ Ability to create terminology-aware machine translation output
    ✓ High-quality glossaries immediately in the hands of linguists and reviewers