ACL RD-TEC 1.0 Summarization of W00-0904
Paper Title:
COMPARISON BETWEEN TAGGED CORPORA FOR THE NAMED ENTITY TASK
Authors: Chikashi Nobata and Nigel Collier and Jun'ichi Tsujii
Primarily assigned technology terms:
- 5-fold cross validation
- 6-fold cross validation
- algorithm
- capitalization
- classification
- classifier
- corpus comparison
- cross validation
- cross validation method
- database
- decision tree
- decision trees
- extraction systems
- finite state
- finite state machines
- hidden markov
- hidden markov model
- hidden markov models
- hmms
- identification
- information extraction
- information extraction systems
- learning
- learning algorithms
- markov model
- maximum entropy
- maximum likelihood
- name finding
- ne recognition
- pos tagger
- predictor
- recogniser
- recognition
- recognition systems
- scoring
- scoring program
- search
- smoothing
- smoothing techniques
- supervised learning
- tagger
- tagging
- term recognition
- terminology
- transcription
- validation
- viterbi
- viterbi algorithm
Other assigned terms:
- approach
- biology
- biology corpus
- case
- character type
- class probability
- corpora
- corpus size
- cross entropy
- data sparseness
- distribution
- entropy
- f-score
- f-score performance
- fact
- feature
- feature information
- feature set
- feature value
- frequency counts
- frequency list
- index
- information theory
- information gain
- knowledge
- leaf
- lexical information
- linear time
- markov models
- measure
- measures
- medline
- method
- name class
- named entities
- named entity
- named entity task
- ne task
- norm
- orthographic information
- part-of-speech
- part-of-speech information
- pennsylvania treebank
- pos information
- precision
- predictive power
- probabilities
- probability
- relative frequency
- statistics
- subtree
- symbols
- system performance
- tagged corpora
- tags
- term
- test corpora
- test corpus
- test set
- text
- theory
- token frequency
- tokens
- training
- training corpus
- training data
- training examples
- training set
- transition probabilities
- tree
- treebank
- trees
- vocabulary
- word
- word features
- word lists
- word sequence
- words