ACL RD-TEC 1.0 Summarization of P06-2056
Paper Title:
UNSUPERVISED SEGMENTATION OF CHINESE TEXT BY USE OF BRANCHING ENTROPY
Authors: Zhihui Jin and Kumiko Tanaka-Ishii
Primarily assigned technology terms:
- algorithm
- boundary detection
- chinese segmentation
- chinese word segmentation
- computational linguistics
- error analysis
- human language
- language engineering
- learning
- measuring
- monitoring
- preprocessing
- processing
- segmentation
- segmentation method
- supervised segmentation
- text segmentation
- unsupervised method
- unsupervised segmentation
- word extraction
- word segmentation
Other assigned terms:
- association for computational linguistics
- case
- characters
- chinese characters
- chinese corpus
- chinese language
- chinese text
- chinese word
- chinese words
- chunk
- computational complexity
- concept
- corpora
- corpus size
- entropy
- evaluation function
- fact
- formalization
- input text
- intention
- japanese ideogram
- language data
- linguistic
- linguistic structure
- linguistics
- local maximum
- measure
- measures
- method
- morpheme
- morpheme boundary
- morphemes
- mutual information
- n-gram
- n-grams
- ngram
- phrase
- precision
- punctuation
- segmentation problem
- segments
- semantic
- sentence
- sentences
- substring
- target language
- term
- terms
- test corpus
- test data
- text
- tokens
- training
- training corpus
- training data
- word
- word boundaries
- word boundary
- words