ACL RD-TEC 1.0 Summarization of P06-1048

Paper Title:
MODELS FOR SENTENCE COMPRESSION: A COMPARISON ACROSS DOMAINS, TRAINING REQUIREMENTS AND EVALUATION MEASURES

Authors: James Clarke and Mirella Lapata

Other assigned terms:

  • annotated corpora
  • annotator
  • annotators
  • approach
  • appropriate compression model
  • association for computational linguistics
  • bleu
  • break
  • british national corpus
  • broadcast news
  • broadcast news corpus
  • case
  • comparative study
  • compression corpus
  • context-free grammar
  • corpora
  • corpus frequency
  • correlation
  • data set
  • data sets
  • debugging
  • decision-based model
  • decision-tree model
  • derivations
  • development cycle
  • distribution
  • document
  • document frequency
  • edit distance
  • estimation
  • evaluation data
  • evaluation measures
  • evaluation method
  • evaluation metric
  • evaluations
  • f-score
  • fact
  • feature
  • feature space
  • function words
  • generation
  • genre
  • gold standard
  • grammar
  • grammatical relations
  • grammatical-functional information
  • grammaticality
  • head word
  • human compression performance
  • human performance
  • implementation
  • information content
  • knowledge
  • lambda
  • language model
  • language modeling toolkit
  • language models
  • lexical items
  • likelihood
  • linguistic
  • linguistic knowledge
  • linguistics
  • measure
  • measures
  • mechanisms
  • method
  • model parameters
  • modeling toolkit
  • n-gram
  • natural language
  • news corpus
  • non-parallel corpus
  • nouns
  • parallel corpora
  • parallel corpus
  • parse
  • parse tree
  • parsing paradigm
  • part-of-speech
  • part-of-speech tags
  • penn treebank
  • pos tag
  • precision
  • probability
  • process
  • relation
  • reordering
  • scalability
  • semantic
  • sentence
  • sentence level
  • sentences
  • significance score
  • speech corpora
  • speech data
  • spoken language
  • string edit distance
  • style
  • syntactic constituent
  • syntactic constituents
  • syntactic trees
  • system development
  • tags
  • technique
  • television
  • term
  • terms
  • text
  • text corpus
  • tokens
  • toolkit
  • training
  • training data
  • training set
  • transcript
  • transcripts
  • translation quality
  • tree
  • treebank
  • trees
  • trigram
  • trigram language model
  • verb
  • vocabulary
  • vocabulary size
  • wilcoxon test
  • word
  • word level
  • word order
  • word-based model
  • words
  • ziff-davis corpus

This page last edited on 10 May 2017.
