ACL RD-TEC 1.0 Summarization of P06-1122
Paper Title:
MODELLING LEXICAL REDUNDANCY FOR MACHINE TRANSLATION
Authors: David Talbot and Miles Osborne
Primarily assigned technology terms:
- algorithm
- approximation
- bayesian model selection
- bilingual clustering
- bilingual word clustering
- clustering
- computational linguistics
- computer vision
- decoder
- em algorithm
- estimation process
- hard-clustering
- identity mapping
- instantiation
- k-means
- k-means clustering
- lemmatisation
- likelihood hard-clustering
- machine translation
- marginal likelihood hard-clustering
- markov random field
- maximum likelihood
- model estimation
- model selection
- modelling
- morphological analysis
- optimisation
- parallel training
- parameter estimation
- phrase translation
- phrase-based translation
- scoring
- search
- semi-supervised clustering
- smoothing
- smt system
- statistical estimation
- statistical mt
- statistical translation
- translation model estimation
- translation process
- word clustering
- word-alignment
Other assigned terms:
- adjective
- annotation
- association for computational linguistics
- backoff
- backoff model
- bayesian model
- bias
- binary features
- bleu
- bleu scores
- case
- characters
- cluster
- clustering procedure
- clusters
- co-occurrence
- concepts
- conditional model
- corpora
- czech corpus
- data sets
- distribution
- estimation
- europarl corpus
- evaluation metric
- events
- feature
- feature set
- feature space
- french
- hypothesis
- ibm models
- implementation
- interpolation
- interpolation scheme
- knowledge
- language pair
- language pairs
- lemma
- lexical redundancy
- lexicon
- lexicon model
- likelihood
- linguistic
- linguistics
- mapping
- mappings
- maps
- method
- model complexity
- model parameters
- model structure
- monolingual corpora
- morphological annotation
- nouns
- number agreement
- parallel corpora
- parallel corpus
- parallel text
- parallel training corpus
- part of speech
- part of speech tags
- part-of-speech
- phrase
- phrase translation model
- phrase-based translation model
- priori
- probabilities
- probability
- probability distributions
- procedure
- process
- redundant information
- russian
- sentence
- sentence pair
- sentences
- smt lexicon
- source language
- sparse data
- statistical translation model
- statistics
- substring
- suffixes
- tags
- target language
- target languages
- target vocabulary
- television
- term
- terms
- test data
- text
- tokens
- training
- training corpus
- training data
- training set
- translation lexicon
- translation model
- translation models
- translation probabilities
- translation quality
- translation table
- translations
- treebank
- vocabulary
- word
- words