ACL RD-TEC 1.0 Summarization of W00-1308
Paper Title:
ENRICHING THE KNOWLEDGE SOURCES USED IN A MAXIMUM ENTROPY PART-OF-SPEECH TAGGER
Authors: Kristina Toutanova and Christopher D. Manning
Primarily assigned technology terms:
- algorithm
- approximation
- beam search
- capitalization
- classification
- classifier
- computing
- constrained optimization
- cutoff
- disambiguation
- estimation algorithm
- estimation procedure
- hidden markov
- hidden markov models
- instantiation
- iterative scaling
- learning
- learning methods
- loglinear
- machine learning
- machine learning methods
- markov model
- maximum entropy
- maximum entropy approach
- maximum entropy framework
- maximum entropy method
- maximum entropy model
- maximum likelihood
- model estimation
- modeling
- optimization
- parameter estimation
- part-of-speech prediction
- part-of-speech tagger
- part-of-speech tagging
- preprocessing
- search
- speech tagger
- tagger
- taggers
- tagging
- transformation-based learning
- unknown word tagging
- word tagging
Other assigned terms:
- ambiguity
- ambiguous word
- approach
- baseline model
- beam
- beam size
- case
- classification accuracy
- conditional distribution
- conditional probability
- conditional probability model
- confusion matrix
- contextual information
- data set
- development set
- dictionary
- distribution
- entropy
- estimation
- feature
- feature type
- feature weights
- implementation
- information sources
- joint distribution
- knowledge
- labeling
- lexical information
- likelihood
- linguistic
- markov models
- method
- modal verb
- model complexity
- n-gram
- negation
- noise
- noun category
- nouns
- part of speech
- part-of-speech
- particle
- particles
- parts of speech
- passage
- penn treebank
- prepositions
- probability
- probability distribution
- probability model
- procedure
- process
- proper noun
- proper noun category
- semantic
- sentence
- sources of information
- speech tag
- statistic
- statistics
- stem
- suffixes
- symbol
- syntactic category
- tag sequence
- tagged text
- tagging accuracy
- tags
- term
- terms
- test set
- text
- tokens
- training
- training data
- treebank
- verb
- verb forms
- word
- word features
- words