ACL RD-TEC 1.0 Summarization of J96-2003

Paper Title:
IMPROVING STATISTICAL LANGUAGE MODEL PERFORMANCE WITH AUTOMATICALLY GENERATED WORD HIERARCHIES

Authors: John G. McMahon and Francis J. Smith

Other assigned terms:

  • anaphoric reference
  • approach
  • association for computational linguistics
  • baseline model
  • benchmark
  • bigram
  • bigram model
  • binary tree
  • bottom-up approach
  • brown corpus
  • case
  • characters
  • class information
  • class membership
  • classification hierarchy
  • cluster
  • cluster evaluation
  • clusters
  • co-occurrence
  • cognitive
  • collocation
  • community
  • conditional probability
  • contour
  • corpora
  • distribution
  • entropy
  • evaluation method
  • evaluations
  • events
  • feature
  • grammar
  • implementation
  • interpolation
  • lambda
  • language data
  • language model
  • language model performance
  • language models
  • lexical structure
  • likelihood
  • likelihood probability
  • linguistic
  • linguistic phenomena
  • linguistics
  • lob corpus
  • mapping
  • markov model theory
  • method
  • model performance
  • model probability
  • model theory
  • mutual information
  • n-gram
  • n-gram models
  • n-grams
  • natural language
  • nouns
  • parse
  • part of speech
  • part-of-speech
  • parts of speech
  • performance comparison
  • perplexity
  • phoneme
  • phoneme string
  • pos information
  • probabilistic language model
  • probabilities
  • probability
  • probability estimate
  • probability estimates
  • process
  • pronoun
  • punctuation
  • representations
  • research topic
  • search space
  • semantic
  • semantic information
  • semantic structure
  • sentence
  • sentences
  • sparse data
  • sparse data problem
  • statistical language model
  • statistics
  • tag model
  • tagged corpus
  • tags
  • terms
  • test set
  • text
  • theory
  • tokens
  • training
  • training corpus
  • training data
  • training set
  • training text
  • transformation
  • tree
  • tree representation
  • trees
  • trigram
  • trigram language model
  • trigram model
  • unigram
  • untagged corpora
  • utterance
  • verb
  • vocabulary
  • vocabulary size
  • weighted average language model
  • word
  • word behavior
  • word classes
  • word frequencies
  • word strings
  • word types
  • word-based language model
  • word-class information
  • words

This page last edited on 10 May 2017.
