ACL RD-TEC 1.0 Summarization of C04-1140
Paper Title:
HIGH-PERFORMANCE TAGGING ON MEDICAL TEXTS
Authors: Udo Hahn and Joachim Wermter
Primarily assigned technology terms:
- algorithm
- brill tagger
- data-driven tagging
- high-performance tagging
- human language
- human language technology
- language processing
- language technology
- learning
- linear interpolation
- linguistic analysis
- measuring
- nlp
- parameterization
- parsing
- partitioning
- pos tagger
- processing
- ranking
- re-training
- recognition
- rule-based tagger
- smoothing
- statistical tagger
- tagger
- taggers
- tagging
- terminology
- tnt tagger
- viterbi
- viterbi algorithm
Other assigned terms:
- abbreviations
- adjective
- annotated corpus
- annotation
- annotation effort
- annotators
- bigram
- biology
- break
- case
- co-occurrences
- corpora
- distribution
- document
- document collection
- document structure
- fact
- feature
- genre
- german language
- gold standard
- grammar
- hypothesis
- interpolation
- interpretation
- knowledge
- language corpora
- language data
- language model
- language resources
- lexicon
- linguistic
- linguistic data
- manual annotation
- markov models
- measure
- measures
- medical corpora
- medical corpus
- medical terminology
- n-gram
- n-grams
- negra
- negra corpus
- news corpus
- newspaper corpus
- newspaper language
- noun phrase
- null hypothesis
- parse
- part-of-speech
- pathology
- penn treebank
- phrase
- portability
- pos category
- pos tag
- priori
- probability
- probability distribution
- procedure
- process
- random sample
- sentences
- similarity measures
- specialist lexicon
- standard deviation
- statistical model
- statistics
- sublanguage
- suffix
- tagger lexicon
- tagging accuracy
- tagging performance
- tags
- tagset
- technologies
- technology
- terms
- test data
- test set
- text
- text corpus
- textbook
- tokens
- training
- training corpora
- training set
- training size
- treebank
- trees
- trigram
- understanding
- unigram
- vocabulary
- word
- word types
- words