ACL RD-TEC 1.0 Summarization of W03-1313
Paper Title:
ENCODING BIOMEDICAL RESOURCES IN TEI: THE CASE OF THE GENIA CORPUS
Authors: Tomaz Erjavec and Jin-Dong Kim and Tomoko Ohta and Yuka Tateisi and Jun'ichi Tsujii
Primarily assigned technology terms:
- binding
- biotechnology
- classification
- concept identification
- consistency checking
- database
- databases
- electronic text encoding
- encoder
- encoding
- error correction
- identification
- illustration
- information extraction
- information retrieval/extraction
- linguistic analysis
- linguistic processing
- linking
- markup language
- nlp
- nlp technology
- ontology description
- parser
- parsing
- processing
- prolog
- sampling
- searching
- segmentation
- semantic classification
- standardisation
- taggers
- tagging
- text encoding
- tokenisation
- transcription
- translation process
- validation
- xsl
Other assigned terms:
- abbreviations
- annotated corpora
- annotated corpus
- annotation
- annotation scheme
- biomedical corpora
- biomedical domain
- biotechnology information
- british national corpus
- case
- community
- concept
- concepts
- corpora
- description language
- document
- ellipsis
- exact meaning
- fact
- feature
- feature structures
- gene expression
- genia
- genia corpus
- knowledge
- language corpora
- language resources
- lemma
- lexica
- linguistic
- linguistic information
- mapping
- mark-up
- markup
- meaning
- mechanisms
- medline
- mesh
- meta-data
- named entities
- names
- nlp applications
- nouns
- ontologies
- ontology
- paragraph
- part-of-speech
- part-of-speech tags
- pos information
- procedure
- process
- punctuation
- punctuation marks
- segments
- semantic
- semantic class
- sentence
- sentences
- stems
- syntactic properties
- syntactic structure
- tagged corpora
- tags
- tagset
- taxonomy
- technical terms
- technologies
- technology
- tei standard
- term
- terms
- testing data
- text
- text encoding initiative
- text structure
- tokens
- training
- training and testing data
- transformation
- translation equivalents
- tree
- umls
- unified medical language
- user
- web site
- word
- words