ACL RD-TEC 1.0 Summarization of J92-4003
Paper Title:
CLASS-BASED N-GRAM MODELS OF NATURAL LANGUAGE
Authors: Peter F. Brown and Peter V. deSouza and Robert L. Mercer and Vincent J. Della Pietra and Jenifer C. Lai
Primarily assigned technology terms:
- algorithm
- automatic speech recognition
- automatic spelling
- automatic spelling correction
- clustering
- computational linguistics
- computing
- em algorithm
- greedy algorithm
- grouping
- language processing
- likelihood estimate
- likelihood estimation
- machine translation
- markov model
- maximum likelihood
- maximum likelihood estimation
- natural language processing
- parameter estimation
- processing
- recognition
- recognition system
- sampling
- search
- speech recognition
- speech recognition system
- spelling
- spelling correction
- terminology
- translation system
- tree-building
- word clustering
Other assigned terms:
- 1-gram word distribution
- acoustic signal
- approach
- binary tree
- brown corpus
- case
- characters
- cluster
- clusters
- co-occurrence
- co-occurrence statistics
- coherence
- concept
- concepts
- conditional probabilities
- conditional probability
- distribution
- english text
- entropy
- estimation
- fact
- implementation
- joint probability
- language model
- language models
- language processing tasks
- likelihood
- linguistics
- mapping
- maps
- maximum likelihood estimate
- meaning
- measure
- method
- mutual information
- n-gram
- n-gram language model
- n-gram model
- n-gram models
- n-grams
- names
- natural language
- natural language processing tasks
- noisy channel
- pairs of words
- passage
- perplexity
- posteriori probability
- precision
- priori
- probabilities
- probability
- probability distribution
- process
- processing tasks
- proper names
- relative frequency
- semantic
- semantic classes
- semantic coherence
- signal
- statistics
- stem
- stems
- syntactic function
- technique
- term
- terms
- test data
- text
- theory
- training
- training data
- training text
- transition matrix
- tree
- vocabulary
- word
- word classes
- word distribution
- word strings
- word-based language model
- word-based model
- words