ACL RD-TEC 1.0 Summarization of W97-0304

Paper Title:
TEXT SEGMENTATION USING EXPONENTIAL MODELS

Authors: Doug Beeferman, Adam Berger, and John Lafferty

Other assigned terms:

  • anchors
  • annotation
  • approach
  • backoff
  • bag of words
  • binary features
  • broadcast news
  • broadcast news corpus
  • cache
  • case
  • co-occurrence
  • cohesion
  • concepts
  • conditional probability
  • content words
  • continuous speech
  • conversation
  • corpora
  • cosine measure
  • data consortium
  • dictionary
  • discourse
  • discourse units
  • distribution
  • document
  • document length
  • edit distance
  • entropy
  • error metric
  • evaluation metric
  • events
  • experimental results
  • exponential distribution
  • exponential model
  • f-measure
  • fact
  • feature
  • feature-based approach
  • geometric mean
  • interpolation
  • knowledge
  • language model
  • language models
  • large corpora
  • large corpus
  • lexical cohesion
  • lexical cohesiveness
  • lexical features
  • likelihood
  • linear combination
  • linguistic
  • linguistic data
  • linguistic data consortium
  • linguistic features
  • log-linear model
  • measure
  • measures
  • method
  • model probability
  • mutual information
  • n-grams
  • natural language
  • news corpus
  • pairs of words
  • paragraph
  • paragraphs
  • pauses
  • personal pronoun
  • phrase
  • precision
  • probabilities
  • probability
  • probability distribution
  • probability distributions
  • procedure
  • process
  • pronoun
  • segment boundaries
  • segment boundary
  • segmentation problem
  • segments
  • semantic
  • semantic network
  • sentence
  • sentence boundaries
  • sentence level
  • sentences
  • size of the corpus
  • statistic
  • statistical approach
  • statistical framework
  • statistical model
  • statistics
  • string edit distance
  • style
  • symbol
  • target sentence
  • tdt corpus
  • technique
  • television
  • term
  • terms
  • test data
  • text
  • text corpora
  • text corpus
  • text segments
  • tokens
  • topics
  • training
  • training and test data
  • training data
  • training set
  • transcripts
  • tree
  • trees
  • trigram
  • trigram model
  • uniform distribution
  • user
  • utterance
  • vocabulary
  • wall street journal corpus
  • word
  • word corpus
  • word repetition
  • words
  • wsj corpus

Extracted Section Types:


This page last edited on 10 May 2017.
