ACL RD-TEC 1.0 Summarization of W04-1612
Paper Title:
AUTOMATIC DIACRITIZATION OF ARABIC FOR ACOUSTIC MODELING IN SPEECH RECOGNITION
Authors: Dimitra Vergyri and Katrin Kirchhoff
Primarily assigned technology terms:
- acoustic model training
- acoustic modeling
- analysis tool
- analyzer
- automatic recognition
- clustering
- decoding
- diacritizer
- expectation-maximization
- grapheme-to-phoneme conversion
- graphical modeling
- hidden markov
- hidden markov models
- hmms
- linear regression
- matching
- maximum likelihood
- measuring
- mmie training
- model training
- modeling
- morphological analysis
- morphological analyzer
- morphological analyzers
- morphology
- normalization
- phonetic spelling
- phonetic transcription
- recognition
- recognition system
- recognizer
- regression
- rescoring
- romanization
- segmentation
- speech recognition
- speech recognizer
- speech recognizer training
- speech synthesis
- spelling
- stemmer
- synthesis
- tag assignment
- tagger
- tagging
- training procedure
- transcription
Other assigned terms:
- acoustic information
- acoustic model
- acoustic models
- acoustic signal
- alphabet
- ambiguity
- approach
- arabic text
- arabic treebank
- arabic treebank project
- benchmark
- bigram
- bigram language model
- bigram model
- broadcast news
- broadcast news data
- buckwalter morphological analysis scheme
- case
- cluster
- contextual information
- contextual knowledge
- convergence
- conversation
- corpora
- data set
- diacritization error rate
- dialectal speech
- dictionary
- error rate
- evaluations
- fact
- feature
- foreign words
- french
- hypothesis
- information source
- knowledge
- language model
- language models
- lattices
- lexical ambiguity
- lexicon
- likelihood
- linguistic
- linguistic information
- markov models
- method
- model size
- modeling toolkit
- morphological information
- morphological knowledge
- morphological structure
- n-gram
- final position
- names
- nist
- noise
- orthographic transcription
- phonemes
- probabilities
- probability
- probability distributions
- procedure
- process
- pronunciation
- pronunciation dictionary
- proper names
- segments
- sentence
- sentences
- sequence probability
- signal
- speech data
- standard arabic
- stem
- stems
- substring
- symbols
- syntactic constraints
- syntactic context
- tag sequence
- tag sequence probability
- tag set
- tagging model
- tags
- tagset
- technique
- test data
- text
- toolkit
- training
- training data
- training material
- training samples
- transcriptions
- transition network
- treebank
- treebank project
- trigram
- trigram language model
- uniform probability
- utterance
- verb
- verb class
- vocabulary
- vocabulary size
- vocal tract
- vowel
- word
- word error rate
- word error rates
- word form
- word level
- word sequence
- words
- writing system