ACL RD-TEC 1.0 Summarization of A94-1010

Paper Title:
IMPROVING LANGUAGE MODELS BY CLUSTERING TRAINING SENTENCES

Other assigned terms:

  • acoustic information
  • analogy
  • approach
  • backoff
  • baseline score
  • bigram
  • bigram model
  • case
  • cluster
  • clusters
  • coefficient
  • community
  • concepts
  • corpora
  • correlations
  • data sparseness
  • dialogue model
  • dialogues
  • distribution
  • entropy
  • euclidean distance
  • events
  • experimental results
  • fact
  • grammar
  • grammar rule
  • grammar rules
  • hypotheses
  • hypothesis
  • independence assumption
  • interpretation
  • knowledge
  • language engine
  • language model
  • language models
  • likelihood
  • linguistic
  • linguistic constraints
  • linguistic information
  • measure
  • method
  • model complexity
  • model parameters
  • n-gram
  • n-gram model
  • n-grams
  • names
  • natural language
  • paragraph
  • paragraphs
  • parse
  • parse tree
  • parsing table
  • perplexity
  • precision
  • probabilities
  • probability
  • probability distribution
  • probability distributions
  • process
  • query
  • relative frequency
  • semantic
  • semantic structures
  • sentence
  • sentences
  • similarity measure
  • size of the corpus
  • subcorpus
  • syntactic constructions
  • syntax
  • system performance
  • technique
  • test corpus
  • test data
  • training
  • training corpus
  • training data
  • tree
  • trigram
  • trigram model
  • understanding
  • unigram
  • unigram language model
  • unigram probability
  • user
  • utterance
  • vector space
  • vocabulary
  • word
  • word classes
  • word sequences
  • word strings
  • word-based evaluation
  • words

This page last edited on 10 May 2017.
