ACL RD-TEC 1.0 Summarization of W03-1719
Paper Title:
THE FIRST INTERNATIONAL CHINESE WORD SEGMENTATION BAKEOFF
Authors: Richard Sproat and Thomas Emerson
Primarily assigned technology terms:
- algorithm
- chinese language processing
- chinese word segmentation
- chinese-english machine translation
- coding
- cross-language information retrieval
- encoding
- grammatical analysis
- information retrieval
- language processing
- machine translation
- matching
- matching algorithm
- maximum matching
- perl script
- processing
- reading
- scoring
- search
- search engine
- search engines
- segmentation
- segmenter
- synthesis
- text-to-speech
- text-to-speech synthesis
- tuning
- unknown-word detection
- word segmentation
- word segmentation bakeoff
Other assigned terms:
- abbreviations
- adjective
- approach
- binomial distribution
- case
- characters
- chinese language
- chinese text
- chinese treebank
- chinese word
- city university corpus
- coding scheme
- community
- corpora
- ctb corpus
- dictionaries
- dictionary
- distribution
- evaluations
- fact
- generation
- large corpora
- measure
- natural language
- open test
- penn chinese treebank
- precision
- probability
- process
- queries
- query
- segmentation bakeoff
- sinica corpus
- statistical significance
- statistics
- suffixes
- test corpora
- test corpus
- test data
- test set
- testing data
- text
- theorem
- training
- training and test data
- training corpora
- training corpus
- training data
- training material
- training set
- treebank
- word
- word count
- words