ACL RD-TEC 1.0 Summarization of W96-0102

Paper Title:
MBT: A Memory-Based Part of Speech Tagger-Generator

Authors: Walter Daelemans and Jakob Zavrel and Peter Berck and Steven Gillis

Primarily assigned technology terms:

Other assigned terms:

  • 10-fold cross-validation
  • adverb
  • annotated corpus
  • approach
  • case
  • case information
  • category label
  • class information
  • classification problem
  • computational complexity
  • computational phonology
  • concept
  • context feature
  • context features
  • context information
  • context size
  • context words
  • corpora
  • corpus size
  • cross-validation experiment
  • data set
  • data sets
  • distance metric
  • distribution
  • entropy
  • estimation
  • fact
  • feature
  • feature information
  • feature value
  • formalism
  • function word
  • function words
  • generalisation
  • generation
  • generation process
  • heuristic
  • heuristics
  • implementation
  • index
  • information entropy
  • information gain
  • information sources
  • knowledge
  • leaf
  • lexical representation
  • lexicon
  • linguistic
  • linguistics
  • linguists
  • machine translation research
  • mapping
  • maps
  • measures
  • method
  • methodology
  • morpheme
  • morphemes
  • morphological information
  • morphological rules
  • n-gram
  • n-gram models
  • natural language
  • noise
  • nonterminal
  • part of speech
  • part-of-speech
  • probabilities
  • probability
  • procedure
  • process
  • pronoun
  • query
  • representations
  • root node
  • semantic
  • semantic types
  • sentence
  • sentences
  • set size
  • similarity metric
  • small training corpora
  • sources of information
  • speech tag
  • standard deviation
  • statistical approach
  • statistics
  • subtree
  • suffix
  • suffixes
  • symbol
  • syntactic categories
  • syntactic category
  • tag sequence
  • tag set
  • tagged corpora
  • tagged corpus
  • tagging accuracy
  • tagging task
  • tags
  • technique
  • terms
  • test set
  • text
  • time complexity
  • tokens
  • training
  • training corpora
  • training material
  • training set
  • training set size
  • translation research
  • tree
  • trees
  • verb
  • wall street journal corpus
  • window size
  • word
  • word form
  • word type
  • word types
  • words
  • wsj corpus

Extracted Section Types:


This page last edited on 10 May 2017.
