ACL RD-TEC 1.0 Summarization of N04-2007
Paper Title:
A PRELIMINARY LOOK INTO THE USE OF NAMED ENTITY INFORMATION FOR BIOSCIENCE TEXT TOKENIZATION
Primarily assigned technology terms:
- analyzer
- bioscience text normalization
- classification
- cross-validation
- decision tree
- dictionary lookup
- document retrieval
- entity tagging
- error analysis
- feature selection
- feedback engine
- hidden markov
- hidden markov models
- information extraction
- learning
- learning methods
- machine learning
- machine learning methods
- morphological analyzer
- name extraction
- name tagging
- named entity tagging
- normalization
- noun phrase extraction
- parsing
- phrase extraction
- pipelining
- pipelining approach
- protein name extraction
- protein name tagging
- pruning
- query expansion
- relevance feedback engine
- searching
- significance testing
- tagging
- terminology
- text normalization
- text retrieval
- tokenization
- tokenization system
- tokenizer
- weka
Other assigned terms:
- 10-fold cross-validation
- acronym
- ambiguous punctuation
- approach
- biology
- bioscience text
- break
- case
- characters
- classification problem
- dictionary
- distribution
- document
- f-measure
- feature
- feature vectors
- genia
- genia corpus
- implementation
- information gain
- knowledge
- labeling
- majority class baseline
- markov models
- measures
- medline
- methodology
- named entities
- named entity
- names
- noun phrase
- ontology
- orthography
- part-of-speech
- parts of speech
- phrase
- precision
- proper names
- punctuation
- queries
- query
- sentence
- sentence boundary
- tags
- technical terminology
- term
- terms
- test set
- testing set
- text
- tokens
- tree
- words