ACL RD-TEC 1.0 Summarization of W03-1026
Paper Title:
HOWTOGETACHINESENAME(ENTITY): SEGMENTATION AND COMBINATION ISSUES
Authors: Hongyan Jing and Radu Florian and Xiaoqiang Luo and Tong Zhang and Abraham Ittycheriah
Primarily assigned technology terms:
- algorithm
- automatic speech recognition
- bracketing
- broadcast information service
- capitalization
- chinese named entity recognition
- chinese word segmentation
- chunking
- classification
- classification method
- classifier
- classifier combination
- classifier stacking
- classifiers
- data preparation
- decision trees
- decoding
- disambiguation
- dynamic programming
- dynamic programming approach
- encoding
- entity recognition
- entity recognition system
- entropy classifier
- hidden markov
- hidden markov model
- hmm system
- identification
- information extraction
- information retrieval
- japanese ne recognition
- knowledge discovery
- learning
- linear interpolation
- markov model
- maxent
- maximum entropy
- maximum entropy classifier
- maximum entropy model
- modeling
- named entity recognition
- ne recognition
- ne recognition system
- ne system
- ne tagging
- nlp
- noun phrase chunking
- parsing
- part-of-speech tagging
- phrase chunking
- probability interpolation
- recognition
- recognition system
- risk minimization
- robust risk minimization
- segmentation
- segmenter
- sense disambiguation
- sequence classification
- shallow parsing
- speech recognition
- syntactic bracketing
- tagging
- text chunking
- transformation-based learning
- truncation
- viterbi
- viterbi algorithm
- voting
- weighted voting
- winnow method
- word segmentation
- word segmenter
- word sense disambiguation
Other assigned terms:
- annotated corpus
- annotator
- approach
- base noun
- base noun phrase
- bigram
- broadcast news
- case
- characters
- chinese characters
- chinese corpora
- chinese text
- chinese treebank
- chinese word
- chunk
- chunking model
- class probability
- class-based model
- classification task
- conditional probability
- conditional probability model
- context window
- corpora
- data consortium
- data sets
- dictionaries
- distribution
- duration
- encoding scheme
- entity types
- entropy
- error rate
- estimation
- evaluation measure
- evaluation set
- experimental results
- f-measure
- fact
- feature
- feature type
- feature vector
- fmeasure
- foreign-name
- hmm model
- information sources
- interpolation
- japanese ne
- knowledge
- language model
- lexical features
- likelihood
- linguistic
- linguistic data
- linguistic data consortium
- local context
- measure
- measures
- method
- named entity
- names
- nlp applications
- noun phrase
- oracle
- organization names
- part-of-speech
- part-of-speech tags
- person names
- phrase
- precision
- probabilities
- probability
- probability distribution
- probability distributions
- probability model
- procedure
- programming approach
- recognition task
- relation
- sentence
- statistical language model
- subtree
- subtrees
- system performance
- tagging model
- tagging problem
- tags
- technique
- technology
- test data
- test set
- text
- text chunk
- textual information
- tokens
- toolkit
- topics
- training
- training data
- training examples
- training set
- training size
- training time
- transcriptions
- transcripts
- transition probabilities
- translations
- tree
- treebank
- trees
- trigram
- trigram model
- unigram
- weight vector
- word
- word boundaries
- word sense
- word trigram
- word-based model
- words