ACL RD-TEC 1.0 Summarization of H94-1016
Paper Title:
ON USING WRITTEN LANGUAGE TRAINING DATA FOR SPOKEN LANGUAGE MODELING
Authors: R. Schwartz and L. Nguyen and F. Kubala and G. Chou and G. Zavaliagkos and J. Makhoul
Primarily assigned technology terms:
Other assigned terms:
- abbreviations
- acoustic model
- acoustic models
- ambiguity
- approach
- case
- continuous speech
- data consortium
- development set
- dictionary
- distribution
- error rate
- evaluation test
- evaluations
- events
- fact
- frequency list
- grammar
- language model
- language model probability
- lexicon
- linguistic
- linguistic data
- linguistic data consortium
- markov chain
- method
- model probability
- names
- nouns
- perplexity
- phoneme
- phoneme sequence
- phonemes
- preprocessor
- probabilities
- probability
- process
- punctuation
- recognition accuracy
- relative frequency
- sentence
- sentences
- speech recognition accuracy
- spoken language
- style
- test corpus
- test data
- testing data
- text
- text corpus
- tipster corpus
- topics
- training
- training data
- training material
- training text
- transcribed speech
- transcriptions
- trigram
- trigram language model
- unigram
- unigram model
- vocabulary
- vocabulary size
- word
- word error rate
- word frequencies
- word frequency
- word sequences
- words
- wsj corpus