ACL RD-TEC 1.0 Summarization of N04-1040
Paper Title:
MULTIPLE SIMILARITY MEASURES AND SOURCE-PAIR INFORMATION IN STORY LINK DETECTION
Authors: Francine Chen, Ayman Farahat, and Thorsten Brants
Primarily assigned technology terms:
- analyzer
- automatic speech recognition
- categorization
- classification
- classifier
- classifier combination
- classifiers
- computing
- decision tree
- decision trees
- grouping
- kernel
- language modeling
- learner
- learning
- learning algorithms
- learning techniques
- link detection
- link detection system
- machine learning
- machine learning algorithms
- machine learning techniques
- majority voting
- modeling
- morphological analyzer
- normalization
- polynomial kernel
- post-processing
- preprocessing
- processing
- recognition
- segmentation
- speech recognition
- statistical characterization
- story link detection
- support vector machine
- support vector machines
- svm-light
- term weighting
- terminology
- text categorization
- text segmentation
- voting
- weighting
- weka
Other assigned terms:
- abbreviations
- broadcast news
- categorization task
- characters
- confidence score
- content words
- cosine distance
- cosine measure
- cosine similarity
- cosine similarity measure
- detection task
- distribution
- document
- document frequency
- euclidean distance
- events
- experimental results
- feature
- frequency counts
- implementation
- inverse document frequency
- kl divergence
- kullback-leibler divergence
- language pair
- measure
- measures
- method
- modality
- pair similarity
- probabilistic measure
- probability
- probability distribution
- recognition errors
- similarity measure
- similarity measures
- similarity metrics
- similarity score
- similarity scores
- source language
- statistical information
- statistics
- stems
- story pair similarity
- support vector
- svms
- system description
- technology
- term
- terms
- test data
- text
- text documents
- tokens
- topics
- trained model
- training
- training corpus
- training data
- training set
- tree
- trees
- vocabulary
- word
- word distribution
- words