ACL RD-TEC 1.0 Summarization of N06-1004
Paper Title:
SEGMENT CHOICE MODELS: FEATURE-RICH MODELS FOR GLOBAL DISTORTION IN STATISTICAL MACHINE TRANSLATION
Authors: Roland Kuhn and Denis Yuen and Michel Simard and Patrick Paul and George Foster and Eric Joanis and Howard Johnson
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- beam search
- beam-search
- bootstrap
- bootstrap resampling
- chinese-to-english translation
- computational linguistics
- decision tree
- decision trees
- decision-tree
- decoder
- decoding
- dt training
- dynamic programming
- expansion-pruning
- gelfand-ravishankar-delp expansion-pruning
- human language
- human language technology
- language technology
- learning
- learning technique
- lexical weighting
- machine learning
- machine translation
- modeling
- mt system
- optimization
- phrase alignment
- phrase translation
- phrase-based machine translation
- predictor
- resampling
- rescoring
- search
- segmentation
- smoothing
- software engineering
- statistical machine translation
- trainable decision tree
- training method
- tree-growing
- weight optimization
- weighting
Other assigned terms:
- alphabet
- analogy
- approach
- association for computational linguistics
- baseline model
- beam
- bias
- bleu
- case
- chinese corpus
- chinese words
- chunks
- contextual information
- corpora
- development set
- distribution
- english language
- english language model
- evaluation set
- experimental results
- heuristic
- hypotheses
- hypothesis
- ibm model
- labeling
- language model
- language models
- language pairs
- linguistics
- log-linear combination
- measure
- method
- nist
- pairwise system comparison
- parallel text
- perplexity
- phrase
- phrase translation model
- priori
- probabilistic model
- probabilities
- probability
- probability distributions
- procedure
- question types
- reordering
- segments
- sentence
- sentence pair
- sentences
- source sentence
- syntax
- target language
- technique
- technology
- test corpus
- test data
- text
- tokens
- training
- training corpus
- training data
- training set
- translation model
- tree
- trees
- trigram
- trigram language model
- uniform distribution
- uniform probability
- word
- word alignments
- word classes
- words