ACL RD-TEC 1.0 Summarization of W03-1723
Paper Title:
A Two-Stage Statistical Word Segmentation System for Chinese
Authors: Guohong Fu and Kang-Kwong Luke
Primarily assigned technology terms:
- algorithm
- ambiguity resolution
- chinese language processing
- chinese unknown word identification
- chinese word segmentation
- computational linguistics
- disambiguation
- extractor
- identification
- interpolation technique
- known word segmentation
- language processing
- likelihood estimation
- linear interpolation
- maximum likelihood
- maximum likelihood estimation
- normalization
- processing
- segmentation
- segmentation system
- unknown word identification
- viterbi
- viterbi algorithm
- word bigram
- word identification
- word segmentation
- word segmentation bakeoff
- word segmentation system
- word-based word-formation
- word-formation
Other assigned terms:
- abbreviation
- ambiguity
- approach
- bigram
- bigram model
- characters
- chinese language
- chinese text
- chinese word
- contextual information
- contextual word
- corpora
- data sparseness
- dictionary
- estimation
- evaluations
- experimental results
- f-measure
- f-score
- interpolation
- language models
- lattice
- lexicon
- likelihood
- linguistics
- measures
- open test
- precision
- probabilities
- probability
- process
- segmentation bakeoff
- segmented corpus
- sentence
- suffix
- technique
- test corpus
- text
- training
- training corpus
- training data
- word
- word bigram model
- word boundaries
- word boundary
- word pair
- word string
- words