ACL RD-TEC 1.0 Summarization of W05-0804
Paper Title:
BILINGUAL WORD SPECTRAL CLUSTERING FOR STATISTICAL MACHINE TRANSLATION
Authors: Bing Zhao and Eric P. Xing and Alex Waibel
Primarily assigned technology terms:
- algorithm
- bilingual clustering
- bilingual clustering algorithm
- bilingual word clustering
- chinese word segmentation
- clustering
- clustering algorithm
- computing
- decoder
- decomposition
- eigenstructure computation
- forward-backward training
- greedy search
- hierarchical clustering
- hmms
- inner product
- k-means
- k-means clustering
- kernel
- kernels
- language modelling
- language processing
- machine translation
- maximum likelihood
- modelling
- natural language processing
- optimization
- optimization algorithm
- partitioning
- phrase extraction
- phrase-based decoder
- postprocessing
- preprocessing
- processing
- search
- segmentation
- sentence splitting
- singular value decomposition
- smoothing
- smt system
- spectral clustering
- splitting
- statistical machine translation
- statistical natural language processing
- tokenization
- two-step optimization
- viterbi
- viterbi alignment
- word aligner
- word alignment
- word clustering
- word segmentation
- word translation
- word-to-word translation
Other assigned terms:
- alignment accuracy
- approach
- bigram
- bilingual corpus
- bleu
- bleu score
- case
- chinese sentence
- chinese word
- cluster
- clusters
- co-occurrence
- co-occurrence matrix
- concept
- concepts
- conditional probability
- corpora
- correlation
- correlations
- data set
- data sparseness
- decision rule
- development set
- dimensionality
- english sentence
- english vocabulary
- evaluation metrics
- evaluation test
- experimental results
- f-measure
- fact
- feature
- feature space
- french
- french sentence
- french word
- histogram
- hmm model
- hypotheses
- joint probability
- kernel function
- language model
- large training
- lexicon
- likelihood
- mapping
- maps
- meaning
- measure
- measures
- method
- natural language
- nist
- noise
- oracle
- parallel corpus
- parallel sentence
- perplexity
- phrase
- phrase level
- prior probability
- probabilities
- probability
- process
- semantic
- semantic meaning
- sentence
- sentence pair
- sentences
- similarity matrix
- source language
- source sentence
- sparse data
- sparse data problem
- statistical natural language
- statistics
- style
- syntax
- target language
- target sentence
- terms
- test data
- test set
- training
- training corpus
- training data
- transition probability
- translation accuracy
- translation candidates
- translation equivalence
- translation lexicon
- translation model
- translation models
- translation probabilities
- translation probability
- translation quality
- translational equivalence
- translations
- trigram
- trigram language model
- viterbi path
- vocabulary
- word
- word alignment accuracy
- word alignments
- word co-occurrence
- word pair
- words