ACL RD-TEC 1.0 Summarization of C94-2198
Paper Title:
WORD CLASS DISCOVERY FOR POSTPROCESSING CHINESE HANDWRITING RECOGNITION
Primarily assigned technology terms:
- algorithm
- annealing algorithm
- character recognition
- chinese character recognition
- chinese language modeling
- chinese word segmentation
- class assignment
- class discovery
- classification
- combinatorial optimization
- database
- decoding
- dictionary look-up
- handwriting recognition
- identification
- japanese character recognition
- language modeling
- linguistic decoding
- metropolis algorithm
- modeling
- on-line recognition
- optimization
- pattern recognition
- postprocessing
- processing
- recognition
- recognition systems
- recognizer
- segmentation
- simulated annealing
- smoothing
- speech recognition
- speech recognizer
- training process
- word bigram
- word class discovery
- word classification
- word identification
- word segmentation
Other assigned terms:
- approach
- bigram
- bigram language model
- bigram model
- character bigram model
- characters
- chinese characters
- chinese language
- chinese word
- clusters
- collocation
- corpora
- data sparseness
- dictionary
- duration
- error rate
- experimental results
- french
- grammar
- handwriting
- interpolation
- language model
- language models
- large corpora
- large text corpora
- lattice
- lexical rules
- linguistic
- morphological features
- n-gram
- n-gram model
- n-gram models
- optimization problem
- parts-of-speech
- perplexity
- probabilistic language model
- probabilities
- probability
- procedure
- process
- recognition error rate
- recognition errors
- semantic
- semantic categories
- sentences
- similarity score
- subcorpus
- technology
- terms
- test data
- text
- text corpora
- text corpus
- training
- training corpus
- training data
- training set
- training text
- unigram
- unigram language model
- vocabulary
- word
- word classes
- word collocation
- word frequencies
- word lattice
- word sequence
- words