ACL RD-TEC 1.0 Summarization of P05-1032
Paper Title:
SCALING PHRASE-BASED STATISTICAL MACHINE TRANSLATION TO LARGER CORPORA AND LONGER PHRASES
Authors: Chris Callison-Burch, Colin Bannard, and Josh Schroeder
Primarily assigned technology terms:
- algorithm
- approximation
- binary search
- caching
- computing
- crawler
- decoder
- decoding
- estimator
- extraction technique
- final state
- language processing
- lexical weighting
- likelihood estimator
- longest matching
- machine translation
- machine translation evaluation
- matching
- maximum likelihood
- maximum likelihood estimator
- natural language processing
- phrase alignment
- phrase extraction
- phrase retrieval
- phrase translation
- phrase-based machine translation
- phrase-based statistical machine translation
- phrase-based translation
- processing
- retrieving
- sampling
- search
- search algorithm
- searching
- smoothing
- statistical machine translation
- statistical natural language processing
- translation evaluation
- weighting
- word alignment
Other assigned terms:
- adjective
- alignment template
- approach
- arabic-english parallel corpus
- array
- bleu
- case
- case information
- compact representation
- computational complexity
- corpora
- data consortium
- data set
- data sets
- data structure
- data structures
- disk
- distortion probability
- english corpus
- english verb
- english verb particle
- estimation
- europarl corpus
- evaluation metric
- evaluation set
- fact
- foreign words
- french
- ibm model
- index
- joint probability
- language model
- language model probability
- language pairs
- language processing applications
- large corpora
- large corpus
- likelihood
- linguistic
- linguistic information
- linguistic phenomena
- linguistics
- linguistics data
- mappings
- measure
- method
- model probability
- n-gram
- n-grams
- natural language
- natural language processing applications
- negation
- nist
- nouns
- parallel corpora
- parallel corpus
- particle
- phrase
- phrase-based model
- precision
- probabilities
- probability
- probability estimates
- procedure
- reordering
- retrieval time
- segments
- sentence
- sentences
- statistical natural language
- statistics
- substring
- suffix
- suffixes
- technique
- terms
- test set
- testing set
- text
- training
- training data
- training data set
- translation probabilities
- translation probability
- translation quality
- translation table
- translations
- verb
- verb forms
- word
- word alignments
- word order
- words