ACL RD-TEC 1.0 Summarization of W06-3105
Paper Title:
WHY GENERATIVE PHRASE MODELS UNDERPERFORM SURFACE HEURISTICS
Authors: John DeNero and Dan Gillick and James Zhang and Dan Klein
Primarily assigned technology terms:
- algorithm
- approximation
- computing
- decoder
- decoding
- decomposition
- em algorithm
- em training
- end-to-end machine translation
- expectation maximization
- expectation maximization algorithm
- heuristic interpolation
- language modeling
- learning
- learning approach
- learning process
- likelihood training
- machine translation
- machine translation system
- maximization algorithm
- maximum likelihood
- maximum likelihood training
- modeling
- parameterization
- phrase translation
- phrase-based decoding
- phrase-based machine translation
- phrase-based translation
- pruning
- re-estimation
- segmentation
- smoothing
- sri language modeling
- statistical learning
- training procedure
- training process
- translation system
- tuning
- weighting
- word alignment
- word-alignment
Other assigned terms:
- alignment models
- ambiguity
- approach
- bleu
- bleu score
- bleu scores
- conditional distribution
- conditional probability
- correlation
- data set
- distribution
- english sentence
- english translations
- entropy
- europarl corpus
- evaluation methodology
- fact
- french
- french corpus
- french sentence
- generative model
- generative models
- heuristic
- heuristics
- histogram
- ibm model
- idiom
- interpolation
- language model
- language modeling toolkit
- language pair
- likelihood
- mapping
- methodology
- model parameters
- modeling toolkit
- noisy channel
- non-compositionality
- phrase
- phrase translation model
- phrase-based model
- probabilistic model
- probabilities
- probability
- procedure
- process
- sentence
- sentence pair
- sentence position
- sentences
- set size
- statistical framework
- statistical models
- statistics
- stems
- technique
- test data
- test set
- toolkit
- training
- training and test data
- training corpus
- training data
- training set
- training set size
- translation ambiguity
- translation candidate
- translation direction
- translation model
- translation models
- translation probabilities
- translation quality
- translations
- uniform distribution
- word
- word alignments
- word level
- words