ACL RD-TEC 1.0 Summarization of C04-1051
Paper Title:
UNSUPERVISED CONSTRUCTION OF LARGE PARAPHRASE CORPORA: EXPLOITING MASSIVELY PARALLEL NEWS SOURCES
Authors: Bill Dolan, Chris Quirk, and Chris Brockett
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- automatic alignment
- blind evaluation
- classifier
- clustering
- clustering algorithm
- data collection
- data extraction
- distance extraction
- edit distance extraction
- extraction technique
- giza
- heuristic strategy
- identification
- information retrieval
- learning
- levenshtein
- linking
- machine translation
- multiple sequence alignment
- paraphrase acquisition
- paraphrase identification
- paraphrase recognition
- phrasing
- question answering
- recognition
- sampling
- scoring
- scoring function
- sequence alignment
- spelling
- statistical machine translation
- summarization
- topicalization
- viterbi
- viterbi alignment
- word alignment
- word alignment algorithm
Other assigned terms:
- adverb
- alignment error rate
- alignment problem
- alignment task
- anaphor
- anaphora
- annotation
- annotators
- approach
- case
- characters
- chunks
- cluster
- clusters
- content words
- corpora
- data set
- data sets
- data type
- discourse
- discourse structure
- distance metric
- document
- edit distance
- electronic form
- error rate
- events
- generation
- genre
- gold standard
- heuristic
- ibm models
- implementation
- information content
- knowledge
- large corpora
- levenshtein distance
- lexical information
- lexical items
- linguistic
- linguistic information
- long distance dependencies
- mappings
- measures
- method
- methodology
- monolingual paraphrase
- noise
- parallel sentence
- parallelism
- paraphrase
- paraphrases
- parts of speech
- phrase
- polarity
- precision
- prepositional phrase
- priori
- process
- pronominal anaphora
- punctuation
- random sample
- reordering
- semantic
- semantic content
- semantic relatedness
- semantic roles
- sentence
- sentence pair
- sentences
- source sentence
- string edit distance
- string similarity
- synonym
- synonymy
- tagged corpus
- target word
- technique
- technology
- term
- terms
- test data
- test set
- text
- training
- training corpus
- training data
- training set
- translation models
- translations
- word
- word alignment task
- word count
- word order
- words