ACL RD-TEC 1.0 Summarization of W96-0102
Paper Title:
MBT: A MEMORY-BASED PART OF SPEECH TAGGER-GENERATOR
Authors: Walter Daelemans and Jakob Zavrel and Peter Berck and Steven Gillis
Primarily assigned technology terms:
- algorithm
- approximation
- automatic tagging
- capitalization
- case retrieval
- classification
- classifier
- computational linguistics
- computer vision
- computing
- conflict resolution
- corpus linguistics
- cross-validation
- data fusion
- decision trees
- decision-tree
- decision-tree learning
- disambiguation
- encoding
- error-driven approach
- example-based machine translation
- feature relevance weighting
- feature selection
- feature weighting
- hidden markov
- incremental learning
- indexing
- inductive learning
- information retrieval
- k-nn
- knowledge engineering
- learning
- learning approach
- learning approaches
- lexicon construction
- linguistic engineering
- machine learning
- machine learning approaches
- machine translation
- matching
- memory storage
- memory-based learning
- memory-based tagging
- modeling
- morphological analysis
- morphology
- nearest neighbors
- non-parametric estimation
- non-parametric statistical pattern recognition
- non-parametric statistical pattern recognition technique
- optimisation
- parser
- parsing
- part of speech tagging
- part-of-speech tagging
- pattern recognition
- pattern recognition technique
- pos tagger
- pos tagging
- preprocessing
- processing
- pruning
- reasoning
- recognition
- relevance ordering
- relevance weighting
- retrieving
- rule selection
- search
- searching
- selection mechanism
- semantic disambiguation
- semantic tagging
- sentence analysis
- smoothing
- smoothing techniques
- speech systems
- speech tagging
- spelling
- statistical approaches
- statistical methods
- statistical pattern recognition
- structural disambiguation
- supervised learning
- tagger
- tagger generation
- tagger generator
- taggers
- tagging
- ten-fold cross-validation
- terminology
- text to speech
- tree construction
- weighting
- weighting method
Other assigned terms:
- 10-fold cross-validation
- adverb
- annotated corpus
- approach
- case
- case information
- category label
- class information
- classification problem
- computational complexity
- computational phonology
- concept
- context feature
- context features
- context information
- context size
- context words
- corpora
- corpus size
- cross-validation experiment
- data set
- data sets
- distance metric
- distribution
- entropy
- estimation
- fact
- feature
- feature information
- feature value
- formalism
- function word
- function words
- generalisation
- generation
- generation process
- heuristic
- heuristics
- implementation
- index
- information entropy
- information gain
- information sources
- knowledge
- leaf
- lexical representation
- lexicon
- linguistic
- linguistics
- linguists
- machine translation research
- mapping
- maps
- measures
- method
- methodology
- morpheme
- morphemes
- morphological information
- morphological rules
- n-gram
- n-gram models
- natural language
- noise
- nonterminal
- part of speech
- part-of-speech
- probabilities
- probability
- procedure
- process
- pronoun
- query
- representations
- root node
- semantic
- semantic types
- sentence
- sentences
- set size
- similarity metric
- small training corpora
- sources of information
- speech tag
- standard deviation
- statistical approach
- statistics
- subtree
- suffix
- suffixes
- symbol
- syntactic categories
- syntactic category
- tag sequence
- tag set
- tagged corpora
- tagged corpus
- tagging accuracy
- tagging task
- tags
- technique
- terms
- test set
- text
- time complexity
- tokens
- training
- training corpora
- training material
- training set
- training set size
- translation research
- tree
- trees
- verb
- wall street journal corpus
- window size
- word
- word form
- word type
- word types
- words
- wsj corpus