ACL RD-TEC 1.0 Summarization of N06-2037

Paper Title:
SELECTING RELEVANT TEXT SUBSETS FROM WEB-DATA FOR BUILDING TOPIC SPECIFIC LANGUAGE MODELS

Authors: Abhinav Sethy, Panayiotis Georgiou, and Shrikanth Narayanan

Other assigned terms:

  • association for computational linguistics
  • bigram
  • bleu
  • bleu metric
  • case
  • conversational speech
  • conversational speech language
  • corpora
  • data model
  • distribution
  • distributional similarity
  • entropy
  • error rate
  • estimation
  • evaluations
  • experimental results
  • fact
  • interpolation
  • language model
  • language models
  • learning problem
  • likelihood
  • linguistics
  • maximum likelihood estimate
  • measures
  • method
  • model parameters
  • n-gram
  • n-gram language model
  • n-grams
  • nlp applications
  • noise
  • performance comparison
  • permutation
  • perplexity
  • perplexity reduction
  • probabilities
  • probability
  • process
  • queries
  • query
  • sentence
  • sentence similarity
  • sentences
  • similarity measures
  • statistical models
  • style
  • technology
  • term
  • terms
  • test set
  • text
  • text corpus
  • training
  • trigram
  • unigram
  • unlabeled examples
  • vocabulary
  • vocabulary size
  • word
  • word error rate
  • words

This page last edited on 10 May 2017.
