ACL RD-TEC 1.0 Summarization of J93-2004

Paper Title:
BUILDING A LARGE ANNOTATED CORPUS OF ENGLISH: THE PENN TREEBANK

Authors: Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini

Other assigned terms:

  • adjective
  • adverb
  • ambiguity
  • ambiguous words
  • american english
  • annotated corpora
  • annotated corpus
  • annotation
  • annotation scheme
  • annotation schemes
  • annotation task
  • annotator
  • annotators
  • approach
  • association for computational linguistics
  • attachment site
  • benchmark
  • brown corpus
  • case
  • chunk
  • chunks
  • composition
  • contextual information
  • corpora
  • data consortium
  • determiner
  • device
  • distribution
  • emacs editor
  • error rate
  • fact
  • feature
  • foreign word
  • genre
  • grammar
  • grammars
  • grammatical coverage
  • grammatical structure
  • human annotators
  • intention
  • labeling
  • large corpora
  • lexical item
  • lexical redundancy
  • lexicon
  • linguistic
  • linguistic context
  • linguistic data
  • linguistic data consortium
  • linguistic theory
  • linguistics
  • lisp
  • main verb
  • manual tagging
  • mapping
  • measure
  • measures
  • mechanisms
  • message
  • message understanding conference
  • methodology
  • modifier
  • morpheme
  • morphemes
  • muc-3
  • natural language
  • noun phrase
  • noun phrases
  • nouns
  • parallel corpus
  • parse
  • parse tree
  • parsed corpus
  • parser output
  • parsing models
  • part of speech
  • part-of-speech
  • particle
  • particles
  • past participle
  • penn treebank
  • penn treebank corpus
  • penn treebank project
  • penn treebank tagset
  • permutation
  • personal pronoun
  • personal pronouns
  • phrase
  • pos information
  • pos tag
  • pragmatic information
  • predeterminer
  • predicate-argument
  • predicate-argument structure
  • predicate-argument structures
  • predicates
  • preposition
  • prepositional phrase
  • prepositional phrases
  • prepositions
  • preprocessor
  • process
  • pronoun
  • pronouns
  • proper noun
  • punctuation
  • reflexive pronoun
  • relative clauses
  • representations
  • sbar
  • semantic
  • sentence
  • sentences
  • signal
  • skeletal syntactic structure
  • sparse data
  • spoken language
  • statistical models
  • subcorpus
  • subordinate clauses
  • symbol
  • symbols
  • syntactic categories
  • syntactic category
  • syntactic context
  • syntactic function
  • syntactic information
  • syntactic representation
  • syntactic structure
  • tagged corpus
  • tagged text
  • tagging task
  • tags
  • tagset
  • text
  • theoretical linguistics
  • theories
  • theory
  • tokens
  • topics
  • training
  • training material
  • transcripts
  • transitivity
  • translations
  • tree
  • tree structure
  • treebank
  • treebank corpus
  • treebank project
  • trees
  • unannotated text
  • understanding
  • verb
  • verb lexicon
  • vocabulary
  • wh-determiner
  • word
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
