ACL RD-TEC 1.0 Summarization of W96-0103
Paper Title:
HIERARCHICAL CLUSTERING OF WORDS AND APPLICATION TO NLP TASKS
Primarily assigned technology terms:
- algorithm
- beam search
- classification
- clustering
- clustering algorithm
- clustering method
- clustering technique
- computing
- decision tree
- decision-tree
- decision-tree part-of-speech tagger
- example-based machine translation
- hierarchical clustering
- information clustering
- machine translation
- machine translation system
- maximum entropy
- nlp
- parser
- parsing
- parsing system
- part-of-speech tagger
- part-of-speech tagging
- partitioning
- sampling
- search
- smoothing
- splitting
- statistical parser
- tagger
- tagging
- translation system
- word clustering
Other assigned terms:
- 1-gram word distribution
- american english
- approach
- beam
- bigram
- binary tree
- case
- class information
- clusters
- compounds
- conditional probability
- conditional probability distribution
- corpora
- data sparseness
- data sparseness problem
- distribution
- entropy
- error rate
- events
- feature
- feature value
- language models
- language use
- large corpus
- leaf
- likelihood
- linguist
- linguistic
- main verb
- meaning
- method
- morphological features
- mutual information
- n-gram
- nlp tasks
- noun phrase
- parse
- part-of-speech
- pennsylvania treebank
- perplexity
- phrase
- prepositional phrase
- probability
- probability distribution
- probability distributions
- procedure
- process
- quantitative information
- root node
- semantic
- sentence
- sentence structure
- sentences
- sparseness problem
- statistics
- subclass
- substitutability
- subtree
- syntactic feature
- syntactic tag
- tag set
- tags
- technique
- telecommunications research
- terms
- test data
- test phase
- text
- time complexity
- training
- training and test data
- training data
- training phase
- training set
- tree
- tree representation
- treebank
- treebank project
- verb
- vocabulary
- wall street journal corpus
- word
- word distribution
- words
- wsj corpus