ACL RD-TEC 1.0 Summarization of W04-3209
Paper Title:
COMPARING AND COMBINING GENERATIVE AND POSTERIOR PROBABILITY MODELS: SOME ADVANCES IN SENTENCE BOUNDARY DETECTION IN SPEECH
Authors: Yang Liu, Andreas Stolcke, Elizabeth Shriberg, and Mary Harper
Primarily assigned technology terms:
- acoustic sentence segmentation
- algorithm
- automatic recognition
- automatic speech transcription
- bagging
- boundary classification
- boundary detection
- chunker
- chunking
- classification
- classifier
- classifiers
- coding
- combined classifier
- computational linguistics
- conditional likelihood
- conditional random field
- crfs
- decision tree
- decision tree classifier
- decision trees
- encoding
- error reduction
- estimation method
- estimator
- evaluation tools
- feature extraction
- feature selection
- forward-backward algorithm
- generative modeling
- generative modeling approach
- graphical model
- hidden markov
- hidden markov model
- hmm system
- hmms
- information extraction
- information processing
- l-bfgs parameter estimation
- language model training
- language processing
- likelihood estimation
- linear interpolation
- markov model
- matching
- maxent
- maxent classifier
- maximum entropy
- maximum entropy approach
- maximum likelihood
- maximum likelihood estimation
- model estimation
- model interpolation
- model training
- modeling
- n-gram estimation
- natural language processing
- nlp
- normalization
- parameter estimation
- parsing
- posterior probability estimation
- probability estimation
- processing
- processing technology
- prosodic modeling
- recognition
- recognizer
- segmentation
- sentence boundary classification
- sentence boundary detection
- sentence segmentation
- sentence-boundary detection
- sequence modeling
- smoothing
- speech production
- speech recognition
- speech recognizer
- speech transcription
- speech transcription technology
- stochastic process
- su detection
- summarization
- tagger
- taggers
- tagging
- thresholding
- tnt tagger
- transcription
- tree classifier
- tuning
- word recognition
Other assigned terms:
- acoustic signal
- ambiguity
- annotators
- approach
- baseline model
- benchmark
- bigram
- binary feature
- binary features
- broadcast news
- broadcast news data
- case
- chunk
- chunk tag
- chunk type
- classification accuracy
- classification error
- coding scheme
- community
- conditional markov model
- conditional probability
- conditional probability model
- continuous speech
- conversational speech
- conversational telephone speech
- corpora
- detection task
- development set
- distribution
- duration
- edit distance
- entropy
- entropy formulation
- error metric
- error rate
- estimation
- evaluation set
- event sequence likelihood
- events
- experimental results
- extraction evaluation
- fact
- feature
- feature vector
- genre
- hmm model
- hypotheses
- hypothesis
- index
- information sources
- interpolation
- intonation
- joint probability
- knowledge
- labeling
- language model
- language models
- lexical feature
- lexical features
- lexical information
- likelihood
- linguistics
- maps
- markov sequence
- maxent model
- meaning
- metadata
- method
- model structure
- n-gram
- n-grams
- natural language
- nist
- nlp tasks
- opinions
- part-of-speech
- pause
- pause duration
- pitch
- posterior
- posterior probability
- probabilities
- probability
- probability estimates
- probability model
- procedure
- process
- prosodic feature
- prosodic features
- prosodic information
- prosodic model
- prosodic structure
- prosody
- punctuation
- punctuation information
- recognition errors
- relative error reduction
- representations
- sentence
- sentence boundaries
- sentence boundary
- sequence model
- signal
- speaker change
- speaking style
- speech prosody
- speech recognition errors
- spoken language
- stems
- style
- switchboard corpus
- syntactic structure
- system performance
- tags
- technique
- technology
- term
- test data
- test set
- text
- textual information
- tokens
- trained model
- training
- training and test data
- training data
- training set
- transcriptions
- transcripts
- transition probabilities
- tree
- trees
- wall street journal corpus
- word
- word boundaries
- word boundary
- word classes
- word error rates
- word features
- word information
- word sequence
- word string
- words