ACL RD-TEC 1.0 Summarization of E95-1010

Paper Title:
TEXT ALIGNMENT IN THE REAL WORLD: IMPROVING ALIGNMENTS OF NOISY TRANSLATIONS USING COMMON LEXICAL FEATURES, STRING MATCHING STRATEGIES AND N-GRAM COMPARISONS

Authors: Mark W. Davis, Ted E. Dunning, and William C. Ogden

Other assigned terms:

  • abbreviations
  • alignment probability
  • approach
  • bilingual dictionaries
  • case
  • characters
  • chunks
  • co-occurrence
  • computational overhead
  • corpora
  • data set
  • derivation
  • dictionaries
  • dictionary
  • distribution
  • document
  • ellipsis
  • english language
  • english text
  • english translations
  • fact
  • feature
  • heuristic
  • heuristics
  • histogram
  • implementation
  • information sources
  • knowledge
  • language expression
  • lexical feature
  • lexical features
  • measure
  • measures
  • method
  • multi-lingual information
  • n-gram
  • n-gram match
  • n-grams
  • names
  • noise
  • norm
  • paragraph
  • paragraphs
  • parallel corpora
  • parallel text
  • parallel texts
  • phrase
  • posteriori probability
  • priori
  • probabilities
  • probability
  • probability density
  • procedure
  • process
  • proper names
  • questionnaire
  • segments
  • sentence
  • sentence boundaries
  • sentence boundary
  • sentences
  • source text
  • sources of information
  • standard deviation
  • statistics
  • technical terms
  • technique
  • term
  • terms
  • test corpus
  • text
  • text segments
  • training
  • training set
  • transcriptions
  • translations
  • understanding
  • uniform distribution
  • window size
  • word
  • words

This page last edited on 10 May 2017.
