ACL RD-TEC 1.0 Summarization of W03-0310
Paper Title:
BOOTSTRAPPING PARALLEL CORPORA
Authors: Chris Callison-Burch and Miles Osborne
Primarily assigned technology terms:
- active learning
- algorithm
- automatic construction
- bleu method
- bootstrap
- bootstrapping
- co-training
- co-training algorithm
- decoder
- giza
- harvesting
- internet
- isi rewrite decoder
- language modeling
- language processing
- learner
- learning
- learning method
- learning technique
- learning techniques
- machine learning
- machine translation
- machine translation system
- modeling
- morphology
- natural language processing
- nlp
- processing
- querying
- reverse engineering
- selection algorithm
- self-training
- sentence alignment
- statistical machine translation
- statistical natural language processing
- statistical nlp
- statistical translation
- supervised learning
- supervised learning technique
- supervised machine learning
- supervised method
- translation system
- translation systems
- translator
- vocabulary acquisition
- weakly supervised learning
- word reordering
- word reordering problem
Other assigned terms:
- adjective
- annotated corpora
- approach
- bilingual corpora
- bilingual corpus
- bleu
- candidate translation
- case
- community
- corpora
- english translation
- english translations
- english web
- error rate
- evaluation metrics
- experimental results
- fact
- french
- ibm model
- knowledge
- labeled training data
- language model
- language modeling toolkit
- language models
- language pair
- language pairs
- language processing tasks
- mappings
- meaning
- measure
- method
- modeling toolkit
- monolingual corpus
- natural language
- natural language processing tasks
- nlp community
- noise
- parallel corpora
- parallel corpus
- parallel texts
- penn treebank
- phrase
- process
- processing tasks
- reference translations
- reordering
- sentence
- sentences
- statistical models
- statistical natural language
- target language
- technique
- terms
- text
- toolkit
- training
- training corpora
- training corpus
- training data
- training material
- training set
- training size
- translation accuracy
- translation model
- translation models
- translation quality
- translations
- treebank
- vocabulary
- web page
- web pages
- word
- word error rate
- word form
- word order