ACL RD-TEC 1.0 Summarization of H05-1057
Paper Title:
MATCHING INCONSISTENTLY SPELLED NAMES IN AUTOMATIC SPEECH RECOGNIZER OUTPUT FOR INFORMATION RETRIEVAL
Authors: Hema Raghavan and James Allan
Primarily assigned technology terms:
- ad-hoc retrieval
- algorithm
- approximate string matching
- asr system
- automatic speech recognition
- automatic speech recognizer
- clustering
- computational linguistics
- database
- databases
- decoder
- detection and tracking
- document retrieval
- entity tagger
- expectation maximization
- extrinsic evaluation
- giza
- grouping
- human language
- human language technology
- ibm translation
- information retrieval
- intelligent information retrieval
- intrinsic evaluation
- isi rewrite decoder
- language modeling
- language processing
- language technology
- levenshtein
- link detection
- machine translation
- machine translation system
- matching
- modeling
- name finding
- named entity tagger
- natural language processing
- processing
- recognition
- recognizer
- retrieval track
- search
- search process
- speech recognition
- speech recognizer
- spelling
- spelling correction
- spoken document retrieval
- statistical machine translation
- stemmer
- story link detection
- string matching
- supervised method
- tagger
- taggers
- text editor
- topic detection
- topic detection and tracking
- translation process
- translation system
- transliteration
- unsupervised method
- vector space model
- word spotting
Other assigned terms:
- annotators
- approach
- asr output
- association for computational linguistics
- canonical form
- case
- characters
- cluster
- clusters
- co-reference
- community
- contextual information
- corpora
- detection task
- dictionary
- document
- document frequency
- document vectors
- edit distance
- error rate
- evaluation measures
- evaluations
- events
- feature
- foreign language
- french
- generative model
- generative models
- human judgments
- ibm models
- implementation
- information retrieval community
- inverse document frequency
- language model
- language modeling toolkit
- levenshtein distance
- lexicon
- linguistics
- machine translation model
- mapping
- mean average precision
- meaning
- measure
- measures
- method
- modeling toolkit
- named entities
- named entity
- names
- natural language
- nist
- noisy channel
- opinions
- pairs of words
- parallel corpus
- parallel text
- perplexity
- person names
- precision
- probabilistic model
- probabilities
- probability
- process
- proper names
- queries
- query
- recognition errors
- retrieval performance
- retrieval task
- sentence
- sentences
- signal
- size of the corpus
- source language
- source language text
- statistical significance
- string edit distance
- target language
- target language text
- technique
- technology
- term
- term frequency
- test corpus
- test set
- text
- toolkit
- topics
- training
- training corpus
- training set
- transcript
- transcriptions
- transcripts
- translation model
- translation models
- translation probabilities
- translations
- trec-7
- understanding
- user
- vector space
- vocabulary
- word
- word error rate
- word error rates
- words