ACL RD-TEC 1.0 Summarization of W03-1724
Paper Title:
INTEGRATING NGRAM MODEL AND CASE-BASED LEARNING FOR CHINESE WORD SEGMENTATION
Authors: Chunyu Kit and Zhiming Xu and Jonathan J. Webster
Primarily assigned technology terms:
- algorithm
- case-based learning
- chinese word segmentation
- chinese-english word alignment
- computing
- disambiguation
- dynamic programming
- em algorithm
- em training
- error analysis
- error correction
- error-driven learning
- example-based learning
- example-based learning approach
- language model training
- learning
- learning approach
- matching
- maximal matching
- model training
- normalization
- oov word detection
- parameter estimation
- probabilistic segmentation
- probabilistic word segmentation
- recognition
- reestimation
- rule extraction
- rule learning
- segmentation
- segmentation system
- transformation-based error-driven learning
- unsupervised training
- viterbi
- viterbi algorithm
- word alignment
- word detection
- word discovery
- word segmentation
- word segmentation bakeoff
- word segmentation system
Other assigned terms:
- ambiguity
- approach
- bias
- case
- chinese sentence
- chinese word
- context information
- convergence
- corpora
- estimation
- events
- f score
- implementation
- knowledge
- language model
- learning strategy
- local maxima
- ngram
- ngram model
- normalization factor
- precision
- probabilities
- probability
- regular expressions
- segmentation accuracy
- segmentation bakeoff
- sentence
- sentences
- system architecture
- terms
- test corpus
- test set
- training
- training corpora
- training corpus
- training data
- transformation
- transformation rule
- transformation rules
- word
- word count
- word segmentation accuracy
- word sequence
- words