ACL RD-TEC 1.0 Summarization of C04-1051

Paper Title:
UNSUPERVISED CONSTRUCTION OF LARGE PARAPHRASE CORPORA: EXPLOITING MASSIVELY PARALLEL NEWS SOURCES

Authors: Bill Dolan, Chris Quirk, and Chris Brockett

Other assigned terms:

  • adverb
  • alignment error rate
  • alignment problem
  • alignment task
  • anaphor
  • anaphora
  • annotation
  • annotators
  • approach
  • case
  • characters
  • chunks
  • cluster
  • clusters
  • content words
  • corpora
  • data set
  • data sets
  • data type
  • discourse
  • discourse structure
  • distance metric
  • document
  • edit distance
  • electronic form
  • email
  • error rate
  • events
  • generation
  • genre
  • gold standard
  • heuristic
  • ibm models
  • implementation
  • information content
  • knowledge
  • large corpora
  • levenshtein distance
  • lexical information
  • lexical items
  • linguistic
  • linguistic information
  • long distance dependencies
  • mappings
  • measures
  • method
  • methodology
  • monolingual paraphrase
  • noise
  • parallel sentence
  • parallelism
  • paraphrase
  • paraphrases
  • parts of speech
  • phrase
  • polarity
  • precision
  • prepositional phrase
  • priori
  • process
  • pronominal anaphora
  • punctuation
  • random sample
  • reordering
  • semantic
  • semantic content
  • semantic relatedness
  • semantic roles
  • sentence
  • sentence pair
  • sentences
  • source sentence
  • string edit distance
  • string similarity
  • synonym
  • synonymy
  • tagged corpus
  • target word
  • technique
  • technology
  • term
  • terms
  • test data
  • test set
  • text
  • training
  • training corpus
  • training data
  • training set
  • translation models
  • translations
  • word
  • word alignment task
  • word count
  • word order
  • words

This page last edited on 10 May 2017.
