ACL RD-TEC 1.0 Summarization of P93-1024
Paper Title:
DISTRIBUTIONAL CLUSTERING OF ENGLISH WORDS
Authors: Fernando Pereira and Naftali Tishby
Primarily assigned technology terms:
- a statistical part-of-speech
- agglomerative clustering
- class-based modeling
- classification
- classification method
- classifiers
- clustering
- clustering technique
- computational learning
- decomposition
- disambiguation
- distributional clustering
- em method
- entropy maximization
- estimation method
- expression pattern matching
- fidditch parser
- grouping
- hierarchical clustering
- iterative process
- language analysis
- learning
- lexical acquisition
- matching
- maximum entropy
- maximum likelihood
- ml estimation
- modeling
- normalization
- noun classification
- object labeling
- optimization
- parameter estimation
- parser
- part-of-speech tagger
- pattern matching
- reestimation
- regular expression
- search
- similarity estimation
- smoothing
- splitting
- statistical part-of-speech tagger
- supervised learning
- tabulation
- tagger
- text analysis
- unsupervised learning
- word classification
Other assigned terms:
- analogy
- approach
- class membership
- classification problem
- cluster
- cluster centroid
- clustering model
- clustering procedure
- clusters
- concepts
- conditional distribution
- conditional probability
- corpora
- data set
- data sparseness
- derivation
- distribution
- distributional similarity
- encyclopedia
- entropy
- error rate
- estimation
- evaluations
- events
- experimental setting
- fact
- frequency counts
- frequency distribution
- grammar
- grammars
- grammatical relations
- head noun
- hypotheses
- joint distribution
- knowledge
- labeling
- language models
- leaf
- lexicalized grammar
- lexicalized tree-adjoining grammars
- likelihood
- linear combination
- linguistic
- log-likelihood
- main verb
- measure
- measures
- method
- model size
- n-grams
- natural language
- nouns
- numerical accuracy
- part-of-speech
- predictive power
- probabilities
- probability
- probability distributions
- procedure
- process
- regular expression pattern
- relation
- relative frequency
- similarity measure
- sparseness problem
- statistics
- tagged corpora
- technique
- terms
- test corpus
- test set
- text
- training
- training corpus
- training data
- training set
- tree-adjoining grammars
- understanding
- verb
- verb distribution
- verb similarity
- word
- word association
- word classes
- word distribution
- words