ACL RD-TEC 1.0 Summarization of J01-2002
Paper Title:
IMPROVING ACCURACY IN WORD CLASS TAGGING THROUGH THE COMBINATION OF MACHINE LEARNING SYSTEMS
Authors: Hans van Halteren and Walter Daelemans and Jakub Zavrel
Primarily assigned technology terms:
- abstracting
- adaboost
- algorithm
- approximation
- automatic tagging
- bagging
- beam search
- boosting
- bootstrapping
- capitalization
- categorization
- class tagging
- classification
- classification algorithm
- classifier
- classifier combination
- classifier learning
- classifiers
- coding
- combined classifier
- computational linguistics
- corpus annotation
- cross-validation
- data preparation
- decision tree
- decision tree learning
- decision trees
- disambiguation
- encoding
- error correcting
- error rate reduction
- error reduction
- finite-state machine
- grouped voting
- hidden markov
- hidden markov model
- hidden markov models
- hmm tagger
- identification
- induction
- information retrieval
- iterative scaling
- language processing
- language understanding
- learner
- learning
- learning algorithm
- learning algorithms
- learning framework
- learning method
- learning methods
- learning system
- listing
- machine learning
- machine learning algorithms
- machine learning methods
- markov model
- matching
- maximum entropy
- maximum entropy model
- maximum entropy system
- memory-based learner
- memory-based learning
- modeling
- naive bayes
- natural language processing
- nearest neighbors
- neural networks
- nlp
- numerical optimization
- numerical optimization method
- optimization
- optimization method
- output coding
- pairwise voting
- parsers
- parsing
- part-of-speech tagging
- partitioning
- pos tagging
- probabilistic voting
- processing
- rate reduction
- reasoning
- recognition
- resampling
- scoring
- search
- sense disambiguation
- single classifier
- spelling
- spelling correction
- splitting
- statistical nlp
- statistical parsers
- suffix tree
- supertagging
- tagger
- tagger generator
- taggers
- tagging
- tagging system
- text categorization
- transformation-based learning
- tree construction
- tree learning
- trigram tagger
- tuning
- unsupervised training
- viterbi
- viterbi algorithm
- voting
- voting classification
- voting mechanism
- voting system
- weighted voting
- weighting
- word class tagging
- word sense disambiguation
- wsj tagging
Other assigned terms:
- adjective
- adverb
- ambiguity
- ambiguous word
- ambiguous words
- american english
- annotated corpora
- annotated corpus
- annotation
- annotator
- annotators
- approach
- beam
- benchmark
- bias
- binary features
- british english
- case
- characters
- class distribution
- classification model
- classification task
- co-occurrence
- compound words
- compounding
- context features
- context information
- contextual information
- corpora
- cpu time
- data set
- data sets
- derivation
- determiner
- distribution
- dutch
- eindhoven corpus
- english text
- entropy
- error rate
- estimation
- exponential model
- fact
- feature
- feature set
- feature value
- feature-value pair
- fixed-length vector
- formalism
- grammar
- heuristics
- human annotators
- hypothesis
- implementation
- index
- information gain
- information sources
- knowledge
- lancaster-oslo/bergen corpus
- language knowledge
- language model
- language models
- language processing tasks
- leaf
- lexical features
- lexical information
- lexical items
- lexicon
- lexicon entry
- linguistic
- linguistic knowledge
- linguistics
- lob corpus
- mapping
- markov models
- measure
- measures
- method
- n-gram
- names
- natural language
- natural language processing tasks
- natural language text
- nlp applications
- nlp task
- nlp tasks
- nouns
- opinion
- parameter settings
- parse
- part-of-speech
- partial parse
- particle
- past participle
- penn treebank
- penn treebank ii
- penn treebank tagset
- phrase
- phrase attachment
- precision
- preposition
- prepositional phrase
- prepositional phrase attachment
- priori
- probabilistic model
- probabilities
- probability
- probability distribution
- probability distributions
- probability model
- probability tag sequence
- process
- processing tasks
- projection
- proper noun
- punctuation
- relation
- representations
- rule sequence
- run-time
- search space
- sentence
- set size
- similarity metric
- sources of information
- sparse data
- sparse data problem
- statistical significance
- statistics
- stems
- subcorpus
- suffix
- tag information
- tag sequence
- tagged corpora
- tagger combination
- tagging accuracy
- tagging problem
- tagging task
- tags
- tagset
- technique
- term
- terms
- test material
- test set
- text
- tokens
- training
- training corpus
- training data
- training examples
- training material
- training phase
- training set
- training set size
- transformation
- transformation rules
- transition probabilities
- transitivity
- tree
- treebank
- trees
- trigram
- understanding
- unigram
- utterance
- verb
- word
- word classes
- word compounding
- word information
- word sense
- words