ACL RD-TEC 1.0 Summarization of W98-1118
Paper Title:
EXPLOITING DIVERSE KNOWLEDGE SOURCES VIA MAXIMUM ENTROPY IN NAMED ENTITY RECOGNITION
Authors: Andrew Borthwick, John Sterling, Eugene Agichtein, and Ralph Grishman
Primarily assigned technology terms:
- algorithm
- c++
- capitalization
- computational linguistics
- decoding
- entity recognition
- entropy estimation
- entropy training
- estimation procedure
- estimation process
- evaluation system
- feature selection
- feature selection algorithm
- language modeling
- language technology
- machine translation
- maximum entity
- maximum entropy
- maximum entropy system
- modeling
- multi-system integration
- named entity recognition
- named-entity recognition
- ne system
- parsing
- part-of-speech tagging
- pre-processing
- processing
- recognition
- recognition system
- reference resolution
- search
- selection algorithm
- sentence-boundary detection
- statistical system
- statistical techniques
- system integration
- tagger
- taggers
- tagging
- tokenization
- tokenizer
- training algorithm
- viterbi
- viterbi algorithm
- viterbi search
Other assigned terms:
- annotators
- approach
- bigram
- bigram language model
- binary feature
- binary features
- break
- case
- conditional probabilities
- conditional probability
- corpora
- dictionaries
- dictionary
- english text
- entropy
- entropy formulation
- entropy models
- estimation
- external knowledge
- external knowledge source
- f-measure
- fact
- feature
- heuristic
- human intervention
- human performance
- hypothesis
- index
- information sources
- japanese ne
- joint probability
- knowledge
- language model
- lattice
- lexical context
- lexical feature
- lexical features
- linguistic
- linguistic intuition
- linguistics
- maximum entropy models
- measures
- method
- model size
- named entities
- named entity
- named-entity
- names
- organization names
- pairs of words
- part-of-speech
- phrase
- portability
- probabilities
- probability
- procedure
- process
- proper name
- query
- run-time
- run-time performance
- sentences
- suffixes
- system architecture
- system performance
- tags
- technique
- technology
- term
- terms
- test corpora
- test corpus
- test data
- test set
- text
- theory
- tokens
- toolkit
- training
- training corpora
- training corpus
- training data
- training material
- user
- vocabulary
- word
- words
- wrapper