ACL RD-TEC 1.0 Summarization of C02-1101
Paper Title:
DETECTING ERRORS IN CORPORA USING SUPPORT VECTOR MACHINES
Authors: Tetsuji Nakagawa and Yuji Matsumoto
Primarily assigned technology terms:
- algorithm
- anomaly detection
- binary classification
- boosting
- classification
- corpus error detection
- corpus refinement
- corpus-based natural language processing
- error detection
- error detection and correction
- inner product
- japanese morphological analysis
- kernel
- language processing
- language processing systems
- learning
- learning algorithm
- learning method
- learning methods
- learning techniques
- machine learning
- machine learning algorithm
- machine learning methods
- machine learning techniques
- morphological analysis
- natural language processing
- natural language processing systems
- neural networks
- polynomial kernel
- pos tagger
- pos tagging
- postprocessing
- processing
- quadratic programming
- revision learning
- segmentation
- segmentation and pos tagging
- supervised machine learning
- support vector machines
- svm algorithm
- tagger
- taggers
- tagging
- tuning
- unsupervised learning
- word segmentation
- word segmentation and pos tagging
Other assigned terms:
- annotated corpora
- annotated corpus
- annotation
- annotators
- approach
- bigram
- case
- corpora
- corpus-based research
- distribution
- events
- feature
- feature vector
- inflection
- kernel function
- kyoto university corpus
- language processing tasks
- learning model
- learning problem
- likelihood
- method
- morpheme
- morphemes
- n-gram
- natural language
- natural language processing tasks
- part-of-speech
- penn treebank
- pos bigram
- pos tag
- positive and negative examples
- pp attachment
- precision
- probabilistic approach
- probabilities
- processing tasks
- sentence
- sentences
- statistical information
- stochastic model
- support vector
- svm model
- svms
- tagged corpus
- tags
- tokens
- training
- training data
- training example
- training examples
- treebank
- treebank wsj corpus
- trigram
- usability
- word
- words
- wsj corpus