ACL RD-TEC 1.0 Summarization of E06-1014
Paper Title:
IMPROVING PROBABILISTIC LATENT SEMANTIC ANALYSIS WITH PRINCIPAL COMPONENT ANALYSIS
Authors: Ayman Farahat and Francine Chen
Primarily assigned technology terms:
- algorithm
- analyzer
- automatic indexing
- cluster refinement
- clustering
- computing
- computing machinery
- decomposition
- dimensionality reduction
- em algorithm
- expectation maximization
- hard clustering
- indexing
- information retrieval
- k-means
- k-means clustering
- latent semantic analysis
- learning
- likelihood estimation
- maximum likelihood
- maximum likelihood estimation
- maximum-likelihood
- modeling
- morphological analyzer
- normalization
- optimization
- parameter setting
- postprocessing
- principal component analysis
- ranking
- segmentation
- semantic analysis
- singular value decomposition
- text segmentation
- weighting
Other assigned terms:
- approach
- case
- cluster
- clusters
- co-occurrence
- co-occurrence matrix
- corpora
- correlation
- cosine distance
- data set
- dimensionality
- distribution
- document
- document collection
- document collections
- document vectors
- entropy
- error rate
- estimation
- evaluations
- exponential distribution
- external knowledge
- frequency counts
- implementation
- index
- index terms
- interpretation
- joint probability
- knowledge
- kullback-leibler divergence
- latent class
- latent semantic
- latent semantic space
- likelihood
- likelihood function
- mapping
- measure
- method
- model parameters
- model performance
- noise
- norm
- pair similarity
- polysemy
- precision
- probabilities
- probability
- probability distribution
- probability distributions
- probability model
- queries
- query
- relation
- representations
- running time
- segment boundaries
- semantic
- semantic space
- senses of a word
- sentence
- sentence level
- sentences
- similarity measure
- similarity scores
- statistics
- stem
- synonyms
- synonymy
- system description
- technology
- term
- term-document matrix
- terms
- test data
- text
- text similarity
- trained model
- training
- vector space
- word
- word count
- words