ACL RD-TEC 1.0 Summarization of W04-3210

Paper Title:
Automatic Paragraph Identification: A Study across Languages and Domains

Authors: Caroline Sporleder and Mirella Lapata

Other assigned terms:

  • abbreviation
  • anaphora
  • anaphora structure
  • anaphors
  • annotation
  • approach
  • authorship
  • authorship attribution
  • break
  • broadcast news
  • case
  • characters
  • chunks
  • classification accuracy
  • classification task
  • co-occurrence
  • coefficient
  • content words
  • corpora
  • cue words
  • data set
  • data sets
  • development set
  • device
  • dialogues
  • discourse
  • distribution
  • english corpus
  • entropy
  • error rate
  • europarl corpus
  • fact
  • feature
  • generation
  • german corpus
  • human performance
  • hypotheses
  • kappa
  • kappa coefficient
  • knowledge
  • language model
  • language models
  • leaf
  • manual annotation
  • measure
  • measures
  • method
  • n-gram
  • n-gram models
  • named entities
  • names
  • natural language
  • news corpus
  • orthography
  • paragraph
  • paragraph length
  • paragraphs
  • parse
  • parse tree
  • part-of-speech
  • part-of-speech tags
  • penn treebank
  • prediction task
  • probability
  • process
  • pronoun
  • punctuation
  • punctuation mark
  • punctuation marks
  • root node
  • segments
  • semantic
  • sentence
  • sentence boundaries
  • sentences
  • source language
  • stems
  • style
  • syntactic features
  • tags
  • target language
  • term
  • terms
  • test corpus
  • test set
  • text
  • toolkit
  • topics
  • training
  • training data
  • training examples
  • training set
  • training size
  • transcripts
  • tree
  • treebank
  • unigram
  • vocabulary
  • word
  • word co-occurrence
  • word features
  • word lists
  • word order
  • words
  • writing system
  • written texts

Extracted Section Types:


This page last edited on 10 May 2017.
