ACL RD-TEC 1.0 Summarization of P98-2124
Paper Title:
WORD CLUSTERING AND DISAMBIGUATION BASED ON CO-OCCURRENCE DATA
Authors: Hang Li and Naoki Abe
Primarily assigned technology terms:
- algorithm
- clustering
- clustering algorithm
- clustering method
- cross validation
- data base
- data compression
- disambiguation
- disambiguation method
- encoding
- error-driven learning
- estimation algorithm
- estimation method
- greedy heuristic
- hard clustering
- hard-clustering
- heuristic algorithm
- learning
- learning process
- likelihood estimate
- likelihood estimation
- maximum likelihood
- maximum likelihood estimation
- model selection
- parameter estimation
- pp-attachment disambiguation
- qualitative evaluation
- soft clustering
- statistical estimation
- structural disambiguation
- syntactic disambiguation
- transformation-based error-driven learning
- validation
- word clustering
Other assigned terms:
- bigram
- bigram model
- case
- cluster
- clustering model
- co-occurrence
- co-occurrences
- compound noun
- conditional probabilities
- data sparseness
- data sparseness problem
- distribution
- estimation
- experimental results
- fact
- feature
- heuristic
- heuristic rules
- implementation
- information theory
- joint probability
- joint probability distribution
- likelihood
- likelihood function
- maximum likelihood estimate
- mdl principle
- method
- minimum description length
- model selection criterion
- mutual information
- nouns
- penn tree bank
- phrase
- pp-attachment
- prepositions
- probabilities
- probability
- probability distribution
- probability distributions
- probability model
- procedure
- process
- sentences
- sparseness problem
- term
- terms
- test data
- theory
- thesaurus
- time complexity
- training
- training and test data
- training data
- tree
- tree bank
- verb
- verb class
- verb classes
- verb forms
- word
- word classes
- wordnet
- words
- wsj corpus