ACL RD-TEC 1.0 Summarization of W99-0625
Paper Title:
DETECTING TEXT SIMILARITY OVER SHORT PASSAGES: EXPLORING LINGUISTIC FEATURE COMBINATIONS VIA MACHINE LEARNING
Authors: Vasileios Hatzivassiloglou and Judith L. Klavans and Eleazar Eskin
Primarily assigned technology terms:
- algorithm
- classification
- classifier
- clustering
- clustering algorithm
- computational linguistics
- computing
- cross-validation
- cutoff
- detection and tracking
- identification
- induction
- information retrieval
- information retrieval system
- inner product
- learning
- learning algorithm
- learning method
- machine learning
- machine learning algorithm
- matching
- normalization
- noun matching
- relative distance
- retrieval system
- rule induction
- semantic typing
- similarity identification
- splitting
- statistical learning
- summarization
- summarization system
- text analysis
- text matching
- text summarization
- topic detection
- topic detection and tracking
- validation
Other assigned terms:
- ambiguity
- annotated corpus
- annotation
- annotation process
- approach
- bias
- case
- co-occurrence
- community
- concept
- concepts
- document
- document frequency
- document similarity
- empirical results
- entailment
- experimental results
- fact
- feature
- feature set
- feature value
- feature vector
- generation
- implementation
- information retrieval community
- information theory
- input text
- inter-reviewer agreement
- kappa
- kappa statistic
- linguistic
- linguistic information
- linguistics
- linguistics literature
- measure
- measures
- mechanisms
- method
- methodology
- n-grams
- named entity
- names
- nist
- noun phrase
- noun phrases
- nouns
- opinions
- paragraph
- paragraphs
- paraphrase
- part-of-speech
- parts of speech
- phrase
- precision
- probability
- procedure
- process
- proper noun
- query
- random sample
- recursion
- relative frequency
- segments
- semantic
- semantic class
- semantic classes
- semantic distance
- semantic features
- semantic relations
- sense information
- sentences
- similarity definition
- similarity measure
- similarity metric
- sparse data
- sparse data problem
- statistic
- statistics
- stems
- stopword list
- summarization problem
- synonyms
- synonymy
- synset
- synsets
- tdt corpus
- technique
- terms
- text
- text length
- text similarity
- textual units
- tf \* idf
- theory
- training
- training set
- transcripts
- verb
- word
- word co-occurrence
- wordnet
- words