ACL RD-TEC 1.0 Summarization of I05-5001
Paper Title:
SUPPORT VECTOR MACHINES FOR PARAPHRASE IDENTIFICATION AND CORPUS CONSTRUCTION
Authors: Chris Brockett and William B. Dolan
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- bootstrap
- bootstrapping
- classification
- classifier
- classifier learning
- classifiers
- clustering
- corpus construction
- corpus extraction
- cross-validation
- cutoff
- database
- decision trees
- decoder
- document clustering
- giza
- hillclimbing
- identification
- kernels
- language classification
- learning
- learning algorithms
- learning techniques
- levenshtein
- machine learning
- machine learning algorithms
- machine learning techniques
- machine translation
- mining
- multidocument summarization
- multiple sequence alignment
- natural language classification
- non-application-specific corpus extraction
- normalization
- optimization
- paraphrase detection
- paraphrase identification
- paraphrase recognition
- processing
- question answering
- recognition
- reporting
- search
- sequence alignment
- smt system
- spelling
- stemmer
- summarization
- supervised machine learning
- support vector machines
- svm classifier
- tagging
- text classification
- validation
- word alignment
- world wide web
Other assigned terms:
- abbreviations
- alignment error rate
- anchors
- annotated corpus
- annotation
- annotators
- approach
- bilingual corpora
- bilingual parallel corpus
- bilingual sentence
- case
- classification tasks
- cluster
- clusters
- coherence
- community
- comparable corpora
- comparable corpus
- corpora
- corpus size
- correlation
- data set
- data sets
- distribution
- document
- document sets
- edit distance
- error rate
- evaluations
- fact
- feature
- feature set
- feature sets
- generation
- gold standard
- heuristic
- heuristics
- human annotators
- implementation
- intention
- inter-rater agreement
- interpretation
- large corpora
- levenshtein edit distance
- lexical feature
- lexical features
- lexicon
- likelihood
- log-likelihood
- mappings
- method
- methodology
- monolingual paraphrase
- morphological variant
- named entity
- natural language
- noise
- non-application-specific corpus
- parallel corpus
- parameter settings
- paraphrase
- paraphrase corpus
- paraphrases
- phrase
- precision
- probability
- procedure
- search space
- semantic
- sentence
- sentence pair
- sentences
- source language
- source language text
- substring
- support vector
- svms
- synonym
- synonyms
- synonymy
- target language
- technique
- technologies
- terms
- test set
- text
- theory
- training
- training corpus
- training data
- training set
- translations
- trees
- word
- word association
- word pair
- wordnet
- words