ACL RD-TEC 1.0 Summarization of P99-1023
Paper Title:
A SECOND-ORDER HIDDEN MARKOV MODEL FOR PART-OF-SPEECH TAGGING
Authors: Scott M. Thede and Mary P. Harper
Primarily assigned technology terms:
- algorithm
- approximation
- capitalization
- cd-rom
- classification
- classifier
- classifiers
- cross validation
- cross-validation
- data estimation
- fixed smoothing
- good-turing method
- hidden markov
- hidden markov model
- hidden markov models
- hmm tagger
- hmms
- language processing
- lexical smoothing
- longest matching
- markov model
- matching
- natural language processing
- nlp
- nlp systems
- part-of-speech tagger
- part-of-speech tagging
- predictor
- processing
- single classifier
- smoothing
- smoothing method
- smoothing technique
- tagger
- taggers
- tagging
- terminology
- trigram tagger
- validation
- viterbi
- viterbi algorithm
- voting
- weighting
Other assigned terms:
- 10-fold cross validation
- affix
- affixation
- alphabet
- approach
- bigram
- brown corpus
- case
- characters
- contextual information
- corpora
- distribution
- entropy
- estimation
- experimental results
- fact
- feature
- feature information
- first-order model
- knowledge
- lexical information
- lexicon
- markov models
- measure
- method
- n-gram
- natural language
- part of speech
- part-of-speech
- part-of-speech information
- penn treebank
- probabilities
- probability
- probability distribution
- probability distributions
- probability estimates
- process
- research topic
- running time
- sentence
- sentences
- sparse data
- sparse data problem
- statistical data
- statistical model
- statistics
- suffix
- suffixes
- symbol
- symbols
- syntactic categories
- tag sequence
- tagging model
- tags
- technique
- term
- test set
- testing data
- training
- training and testing data
- training corpus
- training data
- transition probabilities
- transition probability
- treebank
- trigram
- trigram model
- unigram
- verb
- wall street journal corpus
- word
- words
- wsj corpus