ACL RD-TEC 1.0 Summarization of N03-2003

Paper Title:
GETTING MORE MILEAGE FROM WEB TEXT SOURCES FOR CONVERSATIONAL SPEECH LANGUAGE MODELING USING CLASS-DEPENDENT MIXTURES

Authors: Ivan Bulyko, Mari Ostendorf, and Andreas Stolcke

Other assigned terms:

  • approach
  • baseline model
  • bigram
  • bigram language model
  • broadcast news
  • case
  • content words
  • conversational speech
  • conversational speech language
  • conversational telephone speech
  • corpora
  • entropy
  • exact match
  • frequency counts
  • function words
  • hypotheses
  • interpolation
  • language model
  • language model perplexity
  • language models
  • large vocabulary speech
  • latent semantic
  • linear combination
  • mapping
  • method
  • model perplexity
  • n-gram
  • n-gram language model
  • n-grams
  • part-of-speech
  • part-of-speech tags
  • pauses
  • perplexity
  • posterior
  • posterior probability
  • probabilities
  • probability
  • queries
  • recognition task
  • search strategy
  • semantic
  • sentence
  • sentence boundary
  • speaking style
  • statistics
  • style
  • switchboard training corpus
  • syntactic structure
  • tags
  • technique
  • terms
  • test set
  • text
  • text corpora
  • tokens
  • topics
  • training
  • training corpus
  • training data
  • training material
  • transcripts
  • trigram
  • user
  • user utterances
  • vocabulary
  • web corpus
  • web pages
  • web text
  • word
  • word error rates
  • word frequency
  • words

This page last edited on 10 May 2017.
