ACL RD-TEC 1.0 Summarization of W03-0504

Paper Title:
SUMMARIZATION OF NOISY DOCUMENTS: A PILOT STUDY

Authors: Hongyan Jing and Daniel Lopresti and Chilin Shih

Other assigned terms:

  • approach
  • bigram
  • boundary information
  • break
  • broadcast news
  • broadcast news speech
  • case
  • characters
  • co-reference
  • cohesion
  • complete parse
  • confidence score
  • confidence scores
  • contextual information
  • cue phrases
  • dictionary
  • document
  • document layout
  • document set
  • document vectors
  • duration
  • duration information
  • edit distance
  • email
  • english text
  • estimation
  • experimental results
  • generation
  • good-turing estimation
  • grammar
  • heuristic
  • heuristic rules
  • index
  • information sources
  • input text
  • intention
  • knowledge
  • language models
  • large corpus
  • lexical cohesion
  • likelihood
  • linguistic
  • main verb
  • mappings
  • measure
  • measures
  • method
  • n-gram
  • named entity
  • natural language
  • noise
  • noise rate
  • noisy input
  • noun phrases
  • nouns
  • ocr performance
  • opinion
  • paragraphs
  • parse
  • parse tree
  • part-of-speech
  • part-of-speech tag
  • pause
  • pause duration
  • phrase
  • precision
  • probability
  • process
  • punctuation
  • punctuation marks
  • query
  • recognition errors
  • sentence
  • sentence boundaries
  • sentence boundary
  • sentence level
  • sentences
  • slot
  • sources of information
  • sparse data
  • speech recognition errors
  • statistics
  • symbol
  • symbols
  • syntactic information
  • synthetic noise
  • synthetic noise rate
  • technique
  • technology
  • test set
  • text
  • text documents
  • tokens
  • training
  • transcripts
  • trec corpus
  • tree
  • trees
  • trigram
  • understanding
  • unigram
  • user
  • user interaction
  • verb
  • word
  • word error rates
  • word frequency
  • words

This page last edited on 10 May 2017.
