ACL RD-TEC 1.0 Summarization of P04-2003
Paper Title:
SEARCHING FOR TOPICS IN A LARGE COLLECTION OF TEXTS
Authors: Martin Holub, Jiri Semecky, and Jiri Divis
Primarily assigned technology terms:
- algorithm
- approximation
- clustering
- clustering algorithm
- clustering method
- computational linguistics
- decomposition
- document clustering
- document indexing
- document ranking
- document retrieval
- estimator
- fuzzy clustering
- hard clustering
- heuristic method
- indexing
- information retrieval
- java
- k-means
- latent semantic indexing
- learning
- linear programming
- linear regression
- local optimization
- lu decomposition
- machine learning
- matching
- matrix inversion
- morphology
- nlp
- optimization
- ranking
- regression
- regression algorithm
- retrieving
- search
- search algorithm
- searching
- semantic indexing
- terminology
Other assigned terms:
- ambiguity
- annotated test collection
- annotator
- approach
- cluster
- clusters
- computational complexity
- concept
- concepts
- correlation
- cosine similarity
- dimensionality
- distance metric
- document
- document frequency
- document similarity
- document vectors
- experimental results
- feature
- heuristic
- human annotator
- implementation
- intention
- kullback-leibler divergence
- latent semantic
- linguistics
- method
- nist
- nouns
- priori
- probability
- probability distributions
- procedure
- process
- queries
- query
- representations
- search procedure
- seed
- semantic
- similarity threshold
- statistics
- technique
- term
- terms
- test collection
- text
- text collection
- text documents
- time complexity
- topics
- training
- training samples
- transformation
- words