ACL RD-TEC 1.0 Summarization of P02-1024

Paper Title:
Exploring Asymmetric Clustering for Statistical Language Modeling

Authors: Jianfeng Gao, Joshua Goodman, Guihong Cao, and Hang Li

Other assigned terms:

  • ambiguity
  • approach
  • array
  • asian language
  • asian language text
  • backoff
  • bigram
  • case
  • character error rate
  • characters
  • chinese language
  • chinese text
  • cluster
  • cluster number
  • clusters
  • comparative study
  • conditional probability
  • convergence
  • corpora
  • data sets
  • data sparseness
  • data sparseness problem
  • entropy
  • error rate
  • estimation
  • experimental results
  • ibm model
  • ibm models
  • independence assumption
  • japanese text
  • kanji
  • language model
  • language model probability
  • language models
  • leaf
  • lexicon
  • linguistics
  • meaning
  • method
  • methodology
  • model parameter
  • model parameters
  • model performance
  • model probability
  • model size
  • mutual information
  • n-gram
  • n-gram model
  • n-gram models
  • n-grams
  • newspaper corpus
  • orthography
  • parameter settings
  • perplexity
  • probabilities
  • probability
  • pruning threshold
  • research topic
  • root node
  • search space
  • sparseness problem
  • stochastic model
  • symbol
  • technique
  • terms
  • test set
  • testing data
  • text
  • text corpora
  • theory
  • training
  • training corpora
  • training data
  • training instance
  • transcript
  • tree
  • tree structure
  • tree structures
  • trees
  • trigram
  • trigram model
  • unigram
  • word
  • word n-gram model
  • word sequence
  • word string
  • word strings
  • word trigram
  • word trigram model
  • words

This page last edited on 10 May 2017.
