ACL RD-TEC 1.0 Summarization of N04-1021
Paper Title:
A SMORGASBORD OF FEATURES FOR STATISTICAL MACHINE TRANSLATION
Authors: Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Anoop Sarkar, Kenji Yamada, Alex Fraser, Shankar Kumar, Libin Shen, David Smith, Katherine Eng, Viren Jain, Zhen Jin, and Dragomir Radev
Primarily assigned technology terms:
- algorithm
- alignment template system
- bootstrap
- bootstrap resampling
- broadcast information service
- chunking
- collins parser
- decomposition
- direct translation
- discriminative reranking
- discriminative training
- feature combination
- giza
- graph representation
- greedy approach
- greedy search
- greedy search algorithm
- language modeling
- list rescoring
- log-linear feature combination
- log-linear modeling
- machine translation
- markov model
- modeling
- mt system
- n-best reranking
- optimization
- parallel training
- parser
- parsers
- parsing
- part-of-speech tagger
- part-of-speech tagging
- phrase alignment
- phrase translation
- preprocessing
- pruning
- re-ranking
- reranking
- resampling
- rescoring
- scoring
- search
- search algorithm
- smoothing
- sri language modeling
- statistical machine translation
- statistical mt
- statistical parser
- statistical parsers
- syntactic analysis
- syntax-based translation
- tagger
- taggers
- tagging
- training algorithm
- training process
- translation systems
- tree transformation
- tree-to-string alignment
- tree-to-tree alignment
- verb phrases
- word alignment
- word selection
Other assigned terms:
- alignment information
- alignment model
- alignment template
- approach
- baseline model
- benchmark
- bleu
- bleu metric
- bleu score
- bleu scores
- candidate translation
- case
- chinese dependency
- chinese sentence
- chinese words
- chinese-english lexicon
- chunk-aligned parallel training corpora
- co-occurrence
- coherence
- composition
- conditional model
- conditional probability
- content words
- corpora
- data set
- decision rule
- dependency parse
- dependency trees
- derivation
- derivation tree
- development set
- elementary tree
- english parse
- english parse tree
- english translation
- english tree
- evaluation metric
- evaluations
- fact
- feature
- feature value
- feature weights
- formalism
- generative models
- genre
- gold standard
- grammar
- grammaticality
- head-word
- heuristics
- hypothesis
- ibm model
- language model
- language model probability
- language modeling toolkit
- leaf
- lexical co-occurrence
- lexical translation
- lexicon
- lexicon entries
- lexicon entry
- log-linear combination
- log-linear model
- log-linear models
- measure
- measures
- method
- methodology
- model parameter
- model parameters
- model probability
- modeling toolkit
- mt quality
- n-best list
- news corpus
- optimization problem
- oracle
- oracle translation
- parallel training corpora
- parallel training corpus
- parameter values
- parse
- parse tree
- parse tree probability
- part-of-speech
- parts of speech
- phrase
- phrase level
- pos tag
- posterior
- posterior probability
- probabilities
- probability
- probability model
- process
- punctuation
- reference translations
- relative frequency
- reordering
- representations
- research topic
- search space
- semantic
- semantic coherence
- sentence
- sentence pair
- sentences
- shallow syntactic feature
- source language
- source language word
- source sentence
- subtree
- subtrees
- syntactic feature
- syntactic features
- syntactic representation
- syntactic tree
- syntax
- tag derivation
- tag derivation tree
- tag sequence
- target language
- target languages
- target sentence
- technique
- test corpus
- test data
- test set
- text
- toolkit
- training
- training corpora
- training corpus
- training data
- training text
- transformation
- translation candidates
- translation model
- translation models
- translation pair
- translation probabilities
- translation probability
- translation quality
- translations
- tree
- tree-adjoining grammar
- treebank
- trees
- trigram
- trigram language model
- unigram
- unigram model
- unigram probability
- verb
- verb phrase
- verb tag
- wall street journal text
- word
- word alignments
- word pair
- words