ACL RD-TEC 1.0 Summarization of P05-1056
Paper Title:
USING CONDITIONAL RANDOM FIELDS FOR SENTENCE BOUNDARY DETECTION IN SPEECH
Authors: Yang Liu, Andreas Stolcke, Elizabeth Shriberg, and Mary Harper
Primarily assigned technology terms:
- algorithm
- automatic recognition
- boundary detection
- classification
- classification method
- classifier
- classifiers
- computational linguistics
- conditional likelihood
- conditional random field
- conditional random fields
- crfs
- decision tree
- decision tree classifier
- decoding
- discriminative training
- evaluation tools
- factoring
- feature extraction
- forward-backward algorithm
- generative modeling
- generative modeling approach
- graphical model
- hidden markov
- hidden markov model
- hmm system
- information extraction
- language processing
- markov model
- matching
- maxent
- maximum entropy
- modeling
- normalization
- parsing
- part-of-speech tagging
- pos tagger
- probability interpolation
- processing
- recognition
- recognizer
- scoring
- segmentation
- sentence boundary detection
- sentence segmentation
- sequence tagging
- speech recognition
- stochastic process
- su detection
- summarization
- tagger
- taggers
- tagging
- text processing
- transcription
- tree classifier
- viterbi
- viterbi algorithm
- viterbi decoding
- voting
- word alignment
Other assigned terms:
- alignment information
- ambiguity
- annotation
- annotators
- approach
- association for computational linguistics
- binary features
- boundary-based error rate
- broadcast news
- broadcast news speech
- chunk
- classification accuracy
- classification error
- classification error rate
- classification performance
- conditional probability
- conversational speech
- conversational telephone speech
- corpora
- crf model
- data set
- dependency information
- detection task
- development set
- discriminative model
- distribution
- duration
- edit distance
- entropy
- error metric
- error rate
- estimation
- evaluation set
- events
- experimental results
- fact
- feature
- feature set
- gaussian prior
- generative model
- genre
- implementation
- interpolation
- joint probability
- knowledge
- labeling
- language models
- lexical features
- likelihood
- linguistics
- log-likelihood
- maxent model
- metadata
- method
- model parameters
- n-gram
- n-grams
- nist
- opinions
- part-of-speech
- part-of-speech tags
- penn treebank
- pitch
- pos sequence
- posterior
- posterior probability
- probabilities
- probability
- probability estimates
- process
- processing tasks
- pronouns
- prosodic feature
- prosodic features
- prosodic information
- prosody
- punctuation
- punctuation information
- recognition errors
- recognition task
- representations
- sentence
- sentence boundaries
- sentence boundary
- sentences
- sequence model
- set size
- sparse data
- sparse data problem
- speaking style
- speech prosody
- speech recognition errors
- speech recognition task
- standard deviation
- stems
- style
- switchboard corpus
- system performance
- tags
- term
- test data
- test set
- text
- text corpora
- text corpus
- textual information
- tokens
- toolkit
- training
- training and test data
- training corpus
- training data
- training set
- training set size
- training time
- transcriptions
- transition probabilities
- tree
- treebank
- word
- word boundaries
- word boundary
- word classes
- word error rates
- word features
- word information
- word sequence
- word string
- words