ACL RD-TEC 1.0 Summarization of W06-3711
Paper Title:
IBM MASTOR SYSTEM: MULTILINGUAL AUTOMATIC SPEECH-TO-SPEECH TRANSLATOR
Authors: Yuqing Gao, Bowen Zhou, Ruhi Sarikaya, Mohamed Afify, Hong-Kwang Kuo, Wei-zhong Zhu, Yonggang Deng, Charles Prosser, Wei Zhang, and Laurent Besacier
Primarily assigned technology terms:
- acoustic modeling
- algorithm
- automatic speech recognition
- beam pruning
- classifier
- computing
- data collection
- decision-tree
- decoder
- decoding
- dictionary search
- discriminative training
- encoding
- finite state
- finite state transducer
- finite-state transducer
- hidden markov
- hidden markov model
- histogram pruning
- ibm viavoice engine
- language generation
- language processing
- language understanding
- machine translation
- markov model
- model training
- model training and decoding
- modeling
- morphological analysis
- morphological tokenization
- multilayer search
- natural language generation
- natural language understanding
- nlu
- normalization
- optimization
- parser
- phonetic transcription
- phrase segmentation
- phrase translation
- phrase-based translation
- phrase-based translation framework
- processing
- processor
- pruning
- recognition
- recognizer
- search
- search algorithm
- search process
- segmentation
- semantic parser
- speech recognition
- speech recognition and translation
- speech recognizer
- speech translation
- speech translation system
- speech-to-speech translation
- speech-to-speech translation system
- spelling
- splitting
- spontaneous speech recognition
- statistical approaches
- statistical machine translation
- statistical translation
- tokenization
- transcription
- transducer
- translation method
- translation system
- translator
- tts system
- user interface
- variable substitution
- viterbi
- viterbi decoder
- weighted finite-state transducer
- word generation
- world wide web
Other assigned terms:
- acoustic model
- acoustic models
- alphabet
- annotated corpora
- annotated corpus
- approach
- arabic language
- background model
- beam
- bigram
- bilingual corpus
- case
- colloquial speech
- composition
- concept
- concepts
- conditional probabilities
- continuous speech
- corpora
- data sparseness
- data sparseness problem
- device
- dialectal speech
- dictionaries
- dictionary
- distribution
- english language
- english language model
- entropy
- estimation
- fact
- foreign language
- generation
- grapheme
- histogram
- knowledge
- language model
- language pairs
- lattice
- lattices
- linguistic
- linguistic knowledge
- meaning
- method
- n-gram
- names
- natural language
- parallel corpus
- phoneme
- phonemes
- phrase
- prefixes and suffixes
- probabilities
- probability
- probability distributions
- procedure
- process
- pronunciation
- pronunciation dictionary
- recognition accuracy
- search space
- semantic
- sentence
- sentences
- source language
- source sentence
- sparseness problem
- speaking style
- speech corpora
- speech data
- speech recognition accuracy
- statistical approach
- statistical model
- statistical models
- stem
- style
- suffix
- suffixes
- target language
- technologies
- text
- training
- training data
- training material
- transcriptions
- translation accuracy
- translation model
- translation models
- translation problem
- translations
- trigram
- understanding
- user
- vocabulary
- vocabulary size
- word
- word sequence
- words