ACL RD-TEC 1.0 Summarization of W06-1632
Paper Title:
USING LINGUISTICALLY MOTIVATED FEATURES FOR PARAGRAPH BOUNDARY IDENTIFICATION
Authors: Katja Filippova and Michael Strube
Primarily assigned technology terms:
- algorithm
- binary classification
- boosting
- boundary identification
- boundary insertion
- categorization
- character recognition
- classification
- classifier
- classifiers
- computational linguistics
- database
- dependency parser
- encoding
- feature combination
- feature selection
- feature subset selection
- hill-climbing
- identification
- internet
- iterative algorithm
- language processing
- learner
- learning
- linking
- machine learning
- matching
- memory-based learner
- multi-document summarization
- natural language processing
- optical character recognition
- optimization
- paragraph boundary identification
- paragraph segmentation
- parser
- preprocessing
- processing
- pronominalization
- ranking
- recognition
- recognition systems
- search
- segmentation
- selection algorithm
- splitting
- string matching
- style analysis
- summarization
- supervised learning
- tagger
- tagging
- text categorization
- text segmentation
- tnt tagger
- topic boundary identification
- topic segmentation
Other assigned terms:
- abbreviations
- anaphora
- anaphors
- anchor
- annotated corpus
- annotation
- approach
- association for computational linguistics
- binary classification problem
- binary feature
- break
- case
- classification problem
- coherence
- cohesion
- content words
- corpora
- cue words
- data set
- dependency trees
- development set
- discourse
- discourse connectives
- distribution
- document
- document structure
- domain-independence
- evaluation measure
- f-measure
- fact
- feature
- feature set
- feature space
- finite verb
- function words
- hypotheses
- hypothesis
- information structure
- interpretation
- language model
- lexical chains
- lexical cohesion
- linguistic
- linguistic features
- linguistics
- main clause
- measure
- named entities
- names
- natural language
- opinion
- paragraph
- paragraphs
- personal pronouns
- phrase
- precision
- presupposition
- process
- pronoun
- pronouns
- punctuation
- relation
- semantic
- semantic class
- sentence
- sentence boundaries
- sentences
- set size
- signal
- style
- syntactic features
- syntactic information
- tags
- test data
- text
- text cohesion
- tokens
- training
- training data
- training instance
- training set
- training size
- transcripts
- tree
- trees
- verb
- web documents
- wikipedia
- window size
- word
- word order
- words
- wrapper
- wsj corpus