ACL RD-TEC 1.0 Summarization of E06-1021
Paper Title:
TOWARDS ROBUST CONTEXT-SENSITIVE SENTENCE ALIGNMENT FOR MONOLINGUAL CORPORA
Authors: Rani Nelken and Stuart M. Shieber
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- alignment method
- bilingual alignment
- biological sequence analysis
- clustering
- decomposition
- decomposition method
- document alignment
- dynamic programming
- dynamic programming algorithm
- feature selection
- feature selection algorithm
- giza
- hidden markov
- hidden markov model
- idf scoring
- information retrieval
- levenshtein
- logistic regression
- markov model
- matching
- monolingual sentence alignment
- paragraph clustering
- paraphrasing
- porter stemming
- programming algorithm
- ranking
- recognizing textual entailment
- regression
- scoring
- searching
- selection algorithm
- sentence alignment
- sentence ordering
- sequence alignment
- sequence analysis
- spelling
- splitting
- summarization
- text rewriting
- weighting
- weka
- word-alignment
Other assigned terms:
- aligned document
- annotation
- approach
- biological sequence
- case
- characters
- cluster
- clusters
- content words
- corpora
- cosine measure
- cosine similarity
- cosine similarity measure
- distance score
- distribution
- document
- document context
- document sets
- encyclopedia
- entailment
- f-measure
- fact
- feature
- heuristics
- human annotation
- ibm model
- implementation
- large corpus
- levenshtein distance
- lexicon
- linear order
- logistic regression model
- mapping
- mapping rules
- mappings
- measure
- measures
- method
- monolingual corpora
- noun phrases
- nouns
- paragraph
- paragraphs
- paraphrase
- paraphrase corpus
- precision
- probabilities
- probability
- probability distribution
- process
- regression model
- relation
- scalability
- scoring scheme
- segments
- semantic
- semantic classes
- sentence
- sentence pair
- sentence similarity
- sentences
- similarity measure
- similarity score
- similarity scores
- synonymy
- term
- terms
- testing set
- text
- textual entailment
- tf \* idf
- training
- training data
- training document
- training documents
- training set
- training text
- transformation
- transition probabilities
- word
- wordnet
- words