ACL RD-TEC 1.0 Summarization of C04-1022
Paper Title:
AUTOMATIC LEARNING OF LANGUAGE MODEL STRUCTURE
Authors: Kevin Duh and Katrin Kirchhoff
Primarily assigned technology terms:
- analyzer
- approximation
- automatic learning
- clustering
- computing
- data-driven approach
- data-driven model section
- data-driven search
- disambiguation
- discounting method
- encoding
- exhaustive search
- factored language modeling
- genetic algorithms
- language modeling
- language modeling approach
- learning
- likelihood estimation
- maximum likelihood
- maximum likelihood estimation
- maximum-likelihood
- model section
- model selection
- modeling
- morphological analysis
- morphological analyzer
- morphological disambiguation
- morphology
- normalization
- optimization
- parameter reduction
- random selection
- recognition
- recognizer
- sampling
- search
- search/optimization
- searching
- segmentation
- smoothing
- smoothing method
- smoothing technique
- smoothing techniques
- speech recognition
- speech recognizer
- statistical language modeling
- structure learning
- structure search
Other assigned terms:
- annotated corpus
- annotation
- approach
- case
- conditional independence
- conditional probabilities
- context-sensitive grammar
- convergence
- corpora
- data set
- data sets
- derivation
- development set
- distribution
- estimation
- evaluation set
- experimental results
- generation
- grammar
- grammar rules
- higher-order distribution
- implementation
- interpolation
- knowledge
- language model
- language model structure
- language models
- large text corpora
- lattice
- lexicon
- likelihood
- linguistic
- linguistic information
- linguistic knowledge
- method
- model parameters
- model structure
- morph
- morphological features
- morphological knowledge
- n-gram
- normalization factor
- opinions
- optimization criterion
- paragraph
- part-of-speech
- perplexity
- probabilities
- probability
- probability distributions
- probability estimates
- probability model
- procedure
- punctuation
- relation
- representations
- run-time
- Russian
- search procedure
- search space
- selection operator
- semantic
- sparse data
- statistical language model
- statistical model
- stem
- stems
- syntactic features
- tags
- technique
- terms
- test data
- test set
- text
- text corpora
- tokens
- training
- training data
- training set
- transcriptions
- trigram
- trigram language model
- unigram
- vocabulary
- word
- word classes
- word features
- word sequence
- word trigram
- word types
- words