ACL RD-TEC 1.0 Summarization of P06-2004

Paper Title:
THE EFFECT OF CORPUS SIZE IN COMBINING SUPERVISED AND UNSUPERVISED TRAINING FOR DISAMBIGUATION

Authors: Michaela Atterer and Hinrich Schütze

Other assigned terms:

  • adjective
  • ambiguity
  • analogy
  • annotated corpora
  • annotated corpus
  • approach
  • association for computational linguistics
  • attachment ambiguity
  • backoff
  • backoff model
  • bias
  • case
  • corpora
  • corpus size
  • data set
  • data sets
  • data structure
  • dependency structures
  • development set
  • disambiguation system
  • document
  • email
  • european monetary system
  • events
  • experimental results
  • fact
  • gold standard
  • head noun
  • hypernym
  • implementation
  • index
  • inverted index
  • knowledge
  • large training
  • lattice
  • lattices
  • linguistics
  • measure
  • method
  • modifier
  • mutual information
  • named entities
  • named entity
  • names
  • natural language
  • noise
  • nouns
  • parse
  • penn treebank
  • person names
  • phrase
  • phrase attachment
  • pointwise mutual information
  • pp attachment
  • preposition
  • prepositional phrase
  • prepositional phrase attachment
  • probabilities
  • procedure
  • proper noun
  • queries
  • query
  • rc attachment
  • relative clause
  • relative clause attachment
  • relative clauses
  • reuters corpus
  • sentence
  • sentences
  • set size
  • sparse data
  • statistics
  • subtree
  • subtrees
  • test set
  • text
  • topics
  • training
  • training set
  • treebank
  • unannotated corpora
  • unannotated corpus
  • unannotated text
  • unlabeled corpora
  • unlabeled corpus
  • verb
  • verb attachment
  • word
  • word count
  • words
  • wsj corpus


This page last edited on 10 May 2017.
