ACL RD-TEC 1.0 Summarization of J05-4003
Paper Title:
IMPROVING MACHINE TRANSLATION PERFORMANCE BY EXPLOITING NON-PARALLEL CORPORA
Primarily assigned technology terms:
- algorithm
- alignment computation
- alignment method
- alignment template model
- article selection
- automatic lexical acquisition
- bootstrap
- bootstrap resampling
- bootstrapping
- candidate selection
- chinese-english mt
- classification
- classification process
- classifier
- classifier training
- classifiers
- computational linguistics
- computing
- crawling
- cross-language information retrieval
- data extraction
- database
- document matching
- document selection
- dynamic programming
- dynamic programming alignment
- dynamic programming approach
- entropy classifier
- extraction method
- extraction system
- feature extraction
- identification
- information retrieval
- learning
- lexical acquisition
- linking
- machine translation
- machine translation system
- machine translation systems
- matching
- maximum entropy
- maximum entropy classifier
- modeling
- mt system
- mt systems
- nlp
- normalization
- optimization
- paragraph alignment
- parallel sentence detection
- parallel sentence extraction
- parallel sentence selection
- parallel training
- processing
- question answering
- resampling
- search
- search engine
- sense tagging
- sentence alignment
- sentence alignment method
- sentence detection
- sentence extraction
- sentence extraction system
- sentence identification
- sentence selection
- smt system
- statistical machine translation
- statistical machine translation system
- statistical modeling
- statistical modeling framework
- statistical mt
- tagging
- translation system
- translation systems
- two-stage classification
- web-mining
- word alignment
- word alignment computation
- word sense tagging
Other assigned terms:
- alignment model
- alignment template
- annotation
- annotation projection
- approach
- association for computational linguistics
- bilingual dictionary
- bleu
- bleu scores
- case
- chinese-english corpus
- classification problem
- cluster
- comparable corpora
- comparable corpus
- content words
- contextual information
- corpora
- corpus size
- cosine similarity
- data consortium
- data sets
- dictionaries
- dictionary
- dictionary coverage
- dictionary entries
- discourse
- document
- english language
- english non-parallel newspaper corpora
- english sentence
- english translation
- entropy
- evaluation metric
- evaluations
- extraction process
- f-score
- feature
- feature weights
- foreign language
- foreign word
- french
- generative model
- heuristic
- heuristics
- ibm model
- implementation
- index
- language model
- language models
- language pair
- language pairs
- lexicon
- likelihood
- linear combination
- linguistic
- linguistic data
- linguistic data consortium
- linguistics
- machine translation performance
- meanings
- measure
- method
- monolingual corpora
- mt evaluation
- n-grams
- names
- nist
- nlp applications
- noise
- non-parallel corpora
- non-parallel corpus
- normalization factor
- out-of-domain corpus
- pair similarity
- paragraph
- parallel corpora
- parallel corpus
- parallel sentence
- parallel texts
- parallelism
- parameter values
- performance evaluation
- phrase
- political discourse
- precision
- probabilities
- probability
- procedure
- process
- processing time
- programming approach
- projection
- query
- reference translations
- sentence
- sentence length model
- sentence level
- sentence pair
- sentence similarity
- sentences
- similarity measure
- similarity score
- size of the corpus
- statistics
- substring
- test corpora
- test corpus
- test data
- test set
- tokens
- toolkit
- training
- training and test data
- training corpora
- training corpus
- training data
- training examples
- training set
- translation direction
- translation equivalents
- translation model
- translation pairs
- translation probabilities
- translations
- unigram
- web pages
- web site
- word
- word alignment model
- word alignments
- word order
- word sense
- words