ACL RD-TEC 1.0 Summarization of P06-2030
Paper Title:
USING BILINGUAL COMPARABLE CORPORA AND SEMI-SUPERVISED CLUSTERING FOR TOPIC TRACKING
Authors: Fumiyo Fukumoto and Yoshimi Suzuki
Primarily assigned technology terms:
- algorithm
- broadcasting
- classification
- clustering
- clustering algorithm
- clustering method
- clustering technique
- computational linguistics
- computing
- cross-language ir
- em algorithm
- k-means
- k-means clustering
- learning
- learning techniques
- maximum-likelihood
- morphological analysis
- normalization
- part-of-speech tagger
- quantitative evaluation
- recognition
- semi-supervised clustering
- smoothing
- speech recognition
- splitting
- statistical techniques
- tagger
- tdt tracking
- term extraction
- topic tracking
- tracking procedure
- unsupervised learning
- vector representation
- weighting
Other assigned terms:
- annotation
- approach
- association for computational linguistics
- benchmark
- bigram
- bilingual corpora
- bilingual dictionary
- bilingual term
- class distribution
- cluster
- clusters
- comparable corpora
- conditional distribution
- contingency table
- corpora
- correlation
- cosine similarity
- dictionary
- distribution
- empirical results
- english corpus
- evaluation data
- evaluation measures
- evaluations
- events
- feature
- gold standard
- hypothesis
- incremental approach
- japanese corpus
- knowledge
- large training
- likelihood
- linguistics
- log-likelihood
- measures
- method
- n-gram
- n-gram model
- nouns
- parallel corpus
- part-of-speech
- precision
- priori
- probabilities
- probability
- probability value
- procedure
- proper noun
- queries
- query
- statistics
- system description
- technique
- term
- terms
- tf * idf
- topics
- training
- training data
- user
- word
- word distribution
- words