ACL RD-TEC 1.0 Summarization of W01-1412
Paper Title:
A COMPARATIVE STUDY ON TRANSLATION UNITS FOR BILINGUAL LEXICON EXTRACTION
Authors: Kaoru Yamamoto, Yuji Matsumoto, and Mihoko Kitamura
Primarily assigned technology terms:
- algorithm
- automatic extraction
- bilingual lexicon extraction
- chunking
- example-based mt
- extraction algorithm
- hidden markov
- hidden markov model
- learning
- learning techniques
- lexicon extraction
- machine learning
- machine learning techniques
- markov model
- matching
- nlp
- pair extraction
- parsers
- preprocessing
- segmentation
- statistical methods
- structural matching
- taggers
- text chunking
- tokenization
- translation pair extraction
- word segmentation
Other assigned terms:
- ambiguity
- ambiguity problem
- approach
- baseline model
- bilingual corpora
- bilingual dictionary
- bilingual lexicon
- bunsetsu
- chunk
- chunks
- co-occurrence
- co-occurrence frequency
- co-occurrences
- coefficient
- comparative study
- compounds
- corpora
- correlation
- data sparseness
- dependency relations
- dependency-linked n-gram
- dice
- dice coefficient
- dictionary
- experimental results
- experimental setting
- fact
- functional word
- generation
- idiomatic expressions
- knowledge
- kyoto university corpus
- lexicon
- linguistic
- linguistic information
- linguistic knowledge
- measures
- method
- monolingual corpora
- n-gram
- n-gram models
- n-grams
- named entities
- ngram
- noise
- parallel corpora
- part-of-speech
- part-of-speech tag
- penn treebank
- precision
- prepositions
- process
- qualitative analysis
- seed
- sentence
- sentences
- statistical model
- tag set
- terms
- test data
- text
- tokens
- training
- training data
- translation pair
- translation pairs
- treebank
- treebank part-of-speech tag set
- word
- word sequences
- words