ACL RD-TEC 1.0 Summarization of A00-2032
Paper Title:
MOSTLY-UNSUPERVISED STATISTICAL SEGMENTATION OF JAPANESE: APPLICATIONS TO KANJI
Authors: Rie Kubota Ando and Lillian Lee
Primarily assigned technology terms:
- algorithm
- bootstrap
- bracketing
- chinese segmentation
- computing
- corresponding training
- data base
- database
- hidden markov
- hidden markov model
- japanese segmentation
- knowledge bases
- markov model
- memory allocation
- morphological analyzers
- mutual-information
- optimization
- parameter training
- postprocessing
- processing
- segmentation
- segmentation algorithm
- segmentation method
- segmenter
- statistical method
- terminology
- word segmentation
Other assigned terms:
- affixes
- annotation
- approach
- case
- characters
- cohesion
- compound words
- corpus size
- dictionaries
- discourse
- discourse entities
- f measure
- f-measure
- fact
- grammar
- grammar rules
- grammars
- heuristics
- information sources
- japanese sentences
- japanese text
- kanji
- knowledge
- lexical entries
- lexical knowledge
- lexicon
- local maximum
- local maximum condition
- meaning
- measure
- measures
- method
- morpheme
- morpheme level
- mutual information
- n-gram
- n-grams
- names
- nouns
- parameter settings
- part-of-speech
- part-of-speech information
- precision
- prefixes and suffixes
- segments
- sentences
- standard deviation
- statistical approach
- statistics
- stem
- stems
- suffix
- suffixes
- terms
- test corpus
- test data
- test set
- text
- topology
- training
- training corpus
- training data
- trigram
- word
- word boundary
- word level
- words