ACL RD-TEC 1.0 Summarization of W06-3113
Paper Title:
HOW MANY BITS ARE NEEDED TO STORE PROBABILITIES FOR PHRASE-BASED TRANSLATION?
Authors: Marcello Federico and Nicola Bertoldi
Primarily assigned technology terms:
- algorithm
- approximation
- automatic speech recognition
- beam-search
- beam-search decoder
- binning method
- clustering
- data compression
- decoder
- decoding
- encoding
- indexing
- information retrieval
- k-means
- kneser-ney smoothing
- language processing
- large-vocabulary speech recognition
- local word reordering
- machine translation
- natural language processing
- partitioning
- phrase-based decoder
- phrase-based translation
- phrase-based translation approach
- phrase-based translation system
- processing
- pruning
- recognition
- reporting
- search
- smoothing
- smoothing method
- smt system
- speech recognition
- statistical machine translation
- translation process
- translation system
- word reordering
Other assigned terms:
- approach
- back-off model
- bleu
- bleu score
- bleu scores
- case
- codebook
- community
- compression ratio
- conditional distribution
- convergence
- corpora
- data set
- data sparseness
- data structure
- data structures
- distribution
- error rate
- experimental results
- feature
- heuristic
- hypotheses
- implementation
- language corpora
- language model
- language models
- language processing tasks
- large-vocabulary speech
- linguistic
- linguistic data
- log-linear model
- maps
- measures
- memory consumption
- method
- modality
- mt evaluation
- n-gram
- n-gram language model
- n-gram models
- n-grams
- natural language
- natural language processing tasks
- nist
- parallel corpus
- parallel texts
- permutation
- phrase
- phrase-based translation model
- posterior
- posterior probability
- probabilistic models
- probabilities
- probability
- process
- processing tasks
- punctuation
- reference translations
- reordering
- retrieval performance
- search space
- sentence
- sentences
- source language
- statistical approach
- statistics
- target languages
- target string
- term
- terms
- test data
- test set
- testing data
- text
- training
- training and test data
- training and testing data
- training data
- translation model
- translation quality
- translations
- unigram
- vocabulary
- word
- word alignments
- word string
- words