ACL RD-TEC 1.0 Summarization of P99-1005
Paper Title:
DISTRIBUTIONAL SIMILARITY MODELS: CLUSTERING VS. NEAREST NEIGHBORS
Primarily assigned technology terms:
- classification
- clustering
- cross-validation
- data compression
- data preparation
- data reduction
- distance function
- distributional clustering
- iterative reestimation
- lagrange multiplier
- language modeling
- language-processing
- modeling
- nearest neighbors
- normalization
- parser
- partial parser
- predictor
- probabilistic clustering
- probability estimation
- re-estimation
- re-estimation procedure
- recognition
- reestimation
- search
- semantic classification
- similarity-based estimation
- speech recognition
- splitting
- ten-fold cross-validation
- weighting
Other assigned terms:
- approach
- backoff
- bigram
- case
- cluster
- clustering model
- clusters
- co-occurrence
- compact representation
- concreteness
- conditional distribution
- conditional independence
- conditional probability
- convergence
- data sparseness
- distribution
- distributional similarity
- english text
- error rate
- estimation
- events
- fact
- feature
- head noun
- heuristic
- implementation
- independence assumption
- jensen-shannon divergence
- joint distribution
- joint probability
- kl divergence
- language model
- language models
- main verb
- measures
- method
- methodology
- model complexity
- model size
- mutual information
- noise
- nouns
- perplexity
- precision
- prediction accuracy
- probabilistic model
- probabilities
- probability
- probability distribution
- probability estimate
- probability estimates
- probability model
- procedure
- process
- recognition error rate
- schema
- semantic
- similarity measures
- sparse data
- sparse data problem
- speech recognition error
- standard deviation
- statistics
- term
- terms
- test data
- test set
- text
- training
- training data
- training phase
- training set
- verb
- weighting scheme
- word
- word co-occurrence
- word similarity
- words