ACL RD-TEC 1.0 Summarization of P99-1036
Paper Title:
A PART OF SPEECH ESTIMATION METHOD FOR JAPANESE UNKNOWN WORDS USING A STATISTICAL MODEL OF MORPHOLOGY AND CONTEXT
Primarily assigned technology terms:
- algorithm
- approximation
- computer science
- computing
- decomposition
- dictionary lookup
- dynamic programming
- error correction
- estimation method
- hidden markov
- hidden markov model
- japanese word segmentation
- markov model
- maximum entropy
- maximum entropy method
- modeling
- morphology
- ocr error correction
- part of speech tagging
- search
- search algorithm
- segmentation
- segmenter
- smoothing
- smoothing techniques
- speech tagging
- spelling
- tagging
- word bigram
- word segmentation
- word segmentation task
- word segmenter
Other assigned terms:
- alphabet
- approach
- baseline model
- bigram
- bigram model
- case
- character bigram model
- character sequence
- character type
- characters
- chinese characters
- cross entropy
- data sparseness
- dictionaries
- dictionary
- distribution
- dynamic programming procedure
- edr corpus
- entropy
- estimation
- events
- f-measure
- fact
- feature-based approach
- function words
- grammatical function
- hapax legomena
- information source
- japanese corpus
- japanese sentences
- japanese word
- joint probability
- kanji
- katakana
- knowledge
- language model
- length distribution
- linguistic
- measures
- method
- morphological knowledge
- ngram
- noun phrases
- nouns
- orthography
- out-of-vocabulary rate
- part of speech
- particles
- perplexity
- poisson distribution
- precision
- prediction accuracy
- probabilistic framework
- probabilities
- probability
- procedure
- pronunciation
- punctuation
- punctuation marks
- relative frequency
- roman alphabet
- segmentation accuracy
- segmented corpus
- sentence
- sentences
- speech information
- spelling model
- statistical model
- substring
- suffix
- surface form
- symbol
- symbols
- tagged corpus
- tagging accuracy
- tagging precision
- tags
- technical terms
- term
- terms
- test data
- test set
- tokens
- training
- training and test data
- training corpus
- training data
- training set
- unigram
- unigram model
- unknown word model
- verb
- word
- word bigram model
- word model
- word morphology
- word segmentation accuracy
- word sequence
- word type
- word-based language model
- words
- writing system